On October 5, our colleague Alexey Mironov presented the article "Russian Web Tables: A Public Corpus of Web Tables for Russian Language Based on Wikipedia" at the DAMDID/RCDL'22 conference. The article presents the first Russian-language corpus of Wikipedia tables, as well as a toolkit for its construction. This corpus will enable further research on machine learning topics such as determining the semantic type of a column, automatically filling in missing values, extracting knowledge, and many more.
The talk was well received, and as a result the conference chairs decided to publish the article in a Springer journal. The conference proceedings have not yet been published, but a preprint is available at the link below.
The DAMDID/RCDL conference is one of the largest and oldest Russian conferences on information management. This year it was held in St. Petersburg, at ITMO University.
Links
Conference website: https://damdid2022.frccsc.ru/
About the conference: https://synthesis.frccsc.ru/rcdl.html
Preprint of the article: https://arxiv.org/abs/2210.06353