Age | Commit message | Author | Files | Lines | |
---|---|---|---|---|---|
2023-01-21 | r/5729 feat(corp/data-import): add import of OpenRussian 'words' table | Vincent Ambo | 1 | -0/+115 | |
This is actually the lemmata table of this corpus, not the forms of all words (they're in a separate table). Change-Id: I89a2c2817ccce840f47406fa2a636f4ed3f49154 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7893 Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI | |||||
2023-01-18 | r/5683 feat(corp/russian/data-import): new OpenCorpora data import tool | Vincent Ambo | 1 | -0/+384 | |
Adds the beginning of a tool which can import OpenCorpora data into a SQLite database. This is quite a lot of toil and there's probably a better way to do this, but overall becoming this intimately familiar with the data structures is quite helpful for understanding what I can/can't do with only this dataset. Change-Id: Ieab33a8ce07ea4ac87917b9c8132226bbc6523b1 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7859 Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI | |||||