I recently did some tinkering with creating a Snowflake UDF in Python. It turns out that I discovered a Python Snowflake UDF Quoting issue on the return value from the function. I was not able to find any information on this issue online so I figured it was good to document it for anyone else that might run into the same problem. It turns out that this also impacted a Javascript UDF that I had also written. Here’s the best part, the fix is the same for both UDF types!
The UDF
My original code was a little more complex than the sample below but it produces the same problem. I created a Python Snowflake UDF to do some manipulation of input and return a result that was the manipulated output.
CREATE OR REPLACE FUNCTION python_test_output(field_value variant)
returns variant
language python
runtime_version = '3.8'
handler = 'mask_wrapper'
as
$$
def mask_wrapper(field_value):
try:
return field_value
except Exception as e:
return "Exception : " + str(e)
$$;
The nice thing about Python on Snowflake is that it DOES support overloading so in theory, I could’ve created multiple functions like:
CREATE OR REPLACE FUNCTION python_test_output(field_value DECIMAL)
returns DECIMAL
language python
runtime_version = '3.8'
handler = 'mask_wrapper'
as
$$
def mask_wrapper(field_value):
try:
return field_value
except Exception as e:
return NULL
$$;
CREATE OR REPLACE FUNCTION python_test_output(field_value TEXT)
returns TEXT
language python
runtime_version = '3.8'
handler = 'mask_wrapper'
as
$$
def mask_wrapper(field_value):
try:
return field_value
except Exception as e:
return "Exception : " + str(e)
$$;
...
I was not able to do this in my setup because it would’ve meant repeating the bulk of my code over and over in those functions. In other database (yeah I know snowflake is a warehouse but it takes sql so plain old “database” it is) platforms, I’ve been able to create these wrapper type functions and have them call a central function where I store the majority of my logic. I wasn’t sure how to accomplish this in Snowflake so I used the VARIANT
data type instead so that I could accept just about any input.
The Problem
The problem is slightly annoying in that my UDF had an issue with quotes being returned for TEXT
data types. Let’s first look at a simple example of a “normal” query result in Snowflake
select 'test';
-------
test
The result is pretty normal, test
. Now let’s look at something that ran through my UDF
select python_test_output(to_variant('test'));
-------
"test"
Notice the difference in the result? This is the Python Snowflake UDF quoting issue that I’m referring to. In this result, the string is "test"
instead of test
. While this isn’t a huge deal, if I had something attempting to pattern match on the string, it would end up failing because the quotes were unexpected. There were other issues that I ran into but this was the most glaring and reproducible.
The Solution
The solution ended up being pretty simple. All I need to do was cast the results of my function to something like TEXT
for output that is supposed to be TEXT
.
select python_test_output(to_variant('test'))::TEXT;
-------
test
this same method also worked on other data types like NUMERIC
…etc..The solution to my Python Snowflake UDF quoting issue ended up being pretty simple but it took me forever to figure it out!
Hopefully, my hours of madness will help someone else