Data and code for "Analyzing the GitHub Repositories of Research Papers"

michaelfaerber/paper-github-analysis

Analyzing the GitHub Repositories of Research Papers

Methodology & Dataset

This repository contains the source code and dataset used to analyze all GitHub repositories linked in scientific papers. The dataset was obtained by querying the Microsoft Academic Graph, which is licensed under ODC-By. Our analysis focuses on several dimensions related to these repositories and their associated papers.
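As a rough illustration of the link-collection step, the sketch below pulls GitHub repository references out of free text. It is not the code from this repository: the regular expression, the `extract_repos` helper, and the clean-up rules are assumptions made for the example, and real paper metadata would likely need more careful handling.

```python
import re

# Hypothetical pattern (an assumption for this sketch): matches
# "github.com/<owner>/<repo>" with or without a URL scheme.
GITHUB_REPO = re.compile(
    r"github\.com/([A-Za-z0-9_.-]+)/([A-Za-z0-9_.-]+)", re.IGNORECASE
)


def extract_repos(text):
    """Return unique "owner/repo" slugs in order of first appearance."""
    repos = []
    for owner, repo in GITHUB_REPO.findall(text):
        # Drop sentence-ending periods that the pattern swallowed.
        repo = repo.rstrip(".")
        # Normalize clone URLs such as ".../repo.git".
        if repo.endswith(".git"):
            repo = repo[:-4]
        if not repo:
            continue
        slug = f"{owner}/{repo}"
        if slug not in repos:
            repos.append(slug)
    return repos


if __name__ == "__main__":
    sample = "Code is at https://github.com/michaelfaerber/paper-github-analysis."
    print(extract_repos(sample))
```

Deduplicating on the slug keeps one entry per repository even when a paper mentions the same link several times.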

Results

Our analysis reveals that the number of stars and the number of forks per repository each follow a power-law distribution. Typically, only one author of a paper contributes to the associated repository. Most GitHub manuals are short, often comprising only a few sentences. The majority of the source code is written in Python, and the papers linking to these repositories, along with their authors, predominantly belong to the AI field.
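To make the power-law observation concrete, here is a minimal sketch of how a power-law exponent can be estimated from a list of counts. It uses the standard continuous maximum-likelihood estimator, alpha = 1 + n / sum(ln(x_i / x_min)), on synthetic data. The function names, the choice of `x_min`, and the synthetic sample are all assumptions for illustration; this is not necessarily the fitting procedure used in the paper, and star counts are discrete, so the continuous estimator is only an approximation.

```python
import math
import random


def sample_power_law(alpha, x_min, n, seed=None):
    """Draw n synthetic values from a continuous power law (inverse transform)."""
    rng = random.Random(seed)
    exponent = -1.0 / (alpha - 1.0)
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [x_min * (1.0 - rng.random()) ** exponent for _ in range(n)]


def estimate_alpha(values, x_min):
    """Continuous maximum-likelihood estimate of the exponent for x >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1.0 + len(tail) / log_sum


if __name__ == "__main__":
    # Synthetic stand-in for per-repository star counts (not the real dataset).
    stars = sample_power_law(alpha=2.5, x_min=1.0, n=20000, seed=0)
    print(f"estimated alpha: {estimate_alpha(stars, x_min=1.0):.2f}")
```

With real data, `x_min` would have to be chosen with care, since a power law typically describes only the tail of the distribution.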

More Information & Citation

For more details, please refer to the following paper:

Michael Färber: "Analyzing the GitHub Repositories of Research Papers." Proceedings of the 2020 ACM/IEEE Joint Conference on Digital Libraries (JCDL'20), Xi'an, China, 2020.

Please cite this paper if you reference our work.

Acknowledgements

We would like to thank Erhan Metin for his valuable contributions to this research.
