about summary refs log tree commit diff
path: root/corp/russian/data-import/src/db_setup.rs (follow)
AgeCommit message (Collapse)AuthorFilesLines
2023-01-25 r/5755 fix(corp/data-import): `rank` is an integer fieldVincent Ambo1-1/+1
Change-Id: Ifc9cd46e5b5521096db19628bd8bcf026106dcc9 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7926 Reviewed-by: tazjin <tazjin@tvl.su> Autosubmit: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-01-22 r/5732 feat(corp/data-import): add import of OR 'translations' tableVincent Ambo1-0/+44
The original dataset contains translations into different languages, but only the English ones are imported here. Note that translations are for lemmata only. Change-Id: Ifb9c32c25fda44c38ad899efca9d205c520c0fa3 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7895 Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-01-21 r/5730 feat(corp/data-import): add import of OR 'words_forms' tableVincent Ambo1-1/+38
This is the full morphological set table for all the words from the lemmata table, which they don't call it that. Change-Id: I6f5be673c5f59f11e36bd8c8c935844a7d4fd170 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7894 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2023-01-21 r/5729 feat(corp/data-import): add import of OpenRussian 'words' tableVincent Ambo1-1/+50
This is actually the lemmata table of this corpus, not the forms of all words (they're in a separate table). Change-Id: I89a2c2817ccce840f47406fa2a636f4ed3f49154 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7893 Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-01-18 r/5702 chore(corp/data-import): namespace tables for OpenCorpora dataVincent Ambo1-21/+21
I'm changing strategies to importing both OC and another dataset before continuing to normalise the data, as it might be easier to do in a set of table-constructing queries inside of SQLite with all raw data in place. Change-Id: I26b41af80586fc1bfd8e26a6be20579068a82507 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7872 Autosubmit: tazjin <tazjin@tvl.su> Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-01-18 r/5691 feat(corp/data-import): parse and import linksVincent Ambo1-0/+24
Change-Id: Iebdbc8f884f28064d7b00b8f8808b5030fa3d05c Reviewed-on: https://cl.tvl.fyi/c/depot/+/7864 Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-01-18 r/5690 feat(corp/data-import): parse and import link typesVincent Ambo1-0/+16
Change-Id: Iae01d1dc6894117dc693b4690d8bc79861212ae6 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7863 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2023-01-18 r/5688 feat(corp/data-import): insert OpenCorpora data into SQLiteVincent Ambo1-0/+128
This is an initial and kind of dumb table structure, but there's some massaging that needs to be done before this makes more sense. Change-Id: I441288b684ef86be507099bcc4ebf984598789c8 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7861 Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI