GitHub releases an open dataset for multilingual developer content

Developers coordinate code across README files, issue threads, and pull request discussions. Much of that exchange happens in English, and a large share happens in other languages. GitHub has released a dataset built to help researchers and developers locate public repositories that carry non-English natural-language content. The GitHub Multilingual Repositories Dataset is available on GitHub under the CC0-1.0 license. The release follows a commitment GitHub made in 2025 as part of Microsoft’s European Digital Commitments …
The post GitHub releases an open dataset for multilingual developer content appeared first on Help Net Security.

Source: www.helpnetsecurity.com
