Incremental training with otcws customized-learn
Download the v3.4 source code
https://ltp.readthedocs.io/zh_CN/v3.4.0/install.html
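Build
Install Visual Studio 2017 and CMake. In CMake, select the Visual Studio 2017 compiler to generate the project, then open it in Visual Studio 2017, choose the Release configuration, and run the build.
Incremental training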
D:\xx\program\ltp-3.4.0\tools\train\Release
Open cmd in this directory and run:
D:\xx\program\ltp-3.4.0\tools\train\Release>otcws.exe customized-learn --baseline-model D:\xx\program\ltp_data_v3.4.0\cws.model --model D:\xx\program\ltp_data_v3.4.0\mycws.model --reference D:\xx\program\ltp-3.4.0\tools\train\sample\seg\example-train.seg --development D:\xx\program\ltp-3.4.0\tools\train\sample\seg\example-holdout.seg
Reference: https://github.com/HIT-SCIR/ltp-cws
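The long command line is easy to mistype, so it can help to assemble it in a script. The following is a minimal sketch, not part of LTP: the helper names build_customized_learn_cmd and run_customized_learn are hypothetical, the flags are the ones used in the command above, and the D:\xx\program locations are the same placeholders and need to be adjusted to your machine.

```python
import subprocess
from pathlib import PureWindowsPath


def build_customized_learn_cmd(otcws_exe, baseline_model, model, reference, development):
    """Return the argument list for `otcws customized-learn` (hypothetical helper)."""
    return [
        str(otcws_exe),
        "customized-learn",
        "--baseline-model", str(baseline_model),  # existing cws.model to start from
        "--model", str(model),                    # where the incremental model is written
        "--reference", str(reference),            # training data
        "--development", str(development),        # held-out data
    ]


def run_customized_learn(cmd):
    """Run the training command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(cmd, check=True)


# Placeholder locations taken from the post; adjust to your own checkout.
LTP_SRC = PureWindowsPath(r"D:\xx\program\ltp-3.4.0")
LTP_DATA = PureWindowsPath(r"D:\xx\program\ltp_data_v3.4.0")
SAMPLE_DIR = LTP_SRC / "tools" / "train" / "sample" / "seg"

cmd = build_customized_learn_cmd(
    otcws_exe=LTP_SRC / "tools" / "train" / "Release" / "otcws.exe",
    baseline_model=LTP_DATA / "cws.model",
    model=LTP_DATA / "mycws.model",
    reference=SAMPLE_DIR / "example-train.seg",
    development=SAMPLE_DIR / "example-holdout.seg",
)
print(" ".join(cmd))

# On the Windows machine where otcws.exe was built, start training with:
# run_customized_learn(cmd)
```

Passing the arguments as a list avoids quoting problems if any of the paths contain spaces.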
Code
# Customized segmentation
# cws_model_path must point to the baseline cws.model
from pyltp import CustomizedSegmentor

customized_segmentor = CustomizedSegmentor()
# Load the models; the second argument is the path to your incrementally trained model
customized_segmentor.load(cws_model_path, '/path/to/your/customized_model')
words_2 = customized_segmentor.segment('你会机器学习中的随机森林吗')
print('\t'.join(words_2))
customized_segmentor.release()
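Forgetting release(), or calling it on the wrong instance, is an easy mistake when several segmentors are in play. One possible way to guard against it is a small context manager. This is a sketch only: released is a hypothetical helper, not part of pyltp, and FakeSegmentor is a stand-in so the example runs without pyltp or any model files.

```python
from contextlib import contextmanager


@contextmanager
def released(segmentor):
    """Yield the segmentor and call its release() on exit, even if an error occurs."""
    try:
        yield segmentor
    finally:
        segmentor.release()


class FakeSegmentor:
    """Stand-in for a pyltp segmentor, used only to demonstrate the helper."""

    def __init__(self):
        self.released = False

    def segment(self, text):
        # Whitespace split, just so the demo returns something.
        return text.split()

    def release(self):
        self.released = True


fake = FakeSegmentor()
with released(fake) as seg:
    words = seg.segment("random forest")
print(words, fake.released)
```

With pyltp, the same pattern would wrap a CustomizedSegmentor after load() has been called, so the model is released when the with block ends.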
# Customized model with an external lexicon
customized_segmentor_1 = CustomizedSegmentor()  # create the instance
# Load the models together with the lexicon
customized_segmentor_1.load_with_lexicon(cws_model_path, '/path/to/your/customized_model', '/path/to/your/lexicon')
words_3 = customized_segmentor_1.segment('你会机器学习中的随机森林吗')
print('\t'.join(words_3))
customized_segmentor_1.release()
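The lexicon passed to load_with_lexicon is, according to the pyltp documentation, a plain-text UTF-8 file with one word per line. A minimal sketch for producing such a file follows; write_lexicon is a hypothetical helper and the word list and temporary path are only illustrative.

```python
import os
import tempfile


def write_lexicon(words, path):
    """Write unique, non-empty words to `path`, one per line, encoded as UTF-8.

    Returns the number of words written.
    """
    seen = set()
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for word in words:
            word = word.strip()
            if word and word not in seen:
                seen.add(word)
                f.write(word + "\n")
    return len(seen)


# Illustrative data: a duplicate and a blank entry are dropped.
lexicon_path = os.path.join(tempfile.mkdtemp(), "lexicon.txt")
count = write_lexicon(["随机森林", "机器学习", "随机森林", " "], lexicon_path)

with open(lexicon_path, encoding="utf-8") as f:
    lines = f.read().splitlines()
print(count, lines)
```

The resulting path can then be given as the third argument of load_with_lexicon.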