Is it possible to tokenize text in PL/PGSQL using regular expressions?

Developer https://www.devze.com 2023-04-07 07:31 Source: Internet
I want to tokenize text in my database with RegEx and store the resulting tokens in a table. First I want to split the words by spaces, and then each token by punctuation.

I'm doing this in my application, but executing it in the database might speed it up.

Is it possible to do this?


There are a number of functions for tasks like that.
To retrieve the 2nd word of a text:

SELECT split_part('split this up', ' ', 2);

Split the whole text and return one word per row:

SELECT regexp_split_to_table('split this up', E'\\s+');

(Actually, the last example splits on any stretch of whitespace.)
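To cover the two-stage split from the question (whitespace first, then punctuation) and store the result, the same function can be chained. This is only a sketch: the tables docs and tokens and their columns are made up for illustration, the POSIX class [[:punct:]] is one possible definition of "punctuation", and LATERAL assumes PostgreSQL 9.3 or later.

```sql
-- Hypothetical tables, not from the original question.
CREATE TEMP TABLE docs   (id int PRIMARY KEY, body text);
CREATE TEMP TABLE tokens (doc_id int, token text);

INSERT INTO docs VALUES (1, 'Hello, world! Split this-up.');

-- Stage 1: split the text on runs of whitespace.
-- Stage 2: split each resulting word on runs of punctuation.
INSERT INTO tokens (doc_id, token)
SELECT d.id, t.token
FROM   docs AS d
CROSS  JOIN LATERAL regexp_split_to_table(d.body, E'\\s+') AS w(word)
CROSS  JOIN LATERAL regexp_split_to_table(w.word, '[[:punct:]]+') AS t(token)
WHERE  t.token <> '';  -- discard empty strings left by leading/trailing punctuation

SELECT doc_id, token FROM tokens;
```

The INSERT is plain SQL, so it can be run directly or placed inside a PL/pgSQL function body. Row order from regexp_split_to_table is not guaranteed once the rows are stored; if token positions matter, WITH ORDINALITY (PostgreSQL 9.4+) can supply them.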
