lionsoul2014/friso
A high-performance Chinese tokenizer written in ANSI C and based on the MMSEG algorithm, with support for both GBK and UTF-8 charsets. It is fully modular and can be easily embedded in other programs such as MySQL, PostgreSQL, and PHP.
Language
C
License
Apache License 2.0
Topics
c, chinese-tokenizer, chinese-word-segmentation, cjk-tokenizer, full-text-search, japanese-tokenizer, korean-tokenizer, php-tokenizer, tokenizer
Stars
505
Updated
Oct 31, 2025