A collection of datasets about Wikipedia and other projects run by the Wikimedia Foundation. The collection is open to contributions by researchers not affiliated with the Foundation.
Our overall data policy is to release into the public domain all datasets that don't require attribution and to license datasets that include textual/media contributions from Wikimedians under the appropriate open license, most commonly a CC BY 3.0 license.
Datasets
2 datasets found.
- Wikipedia dumps of the full content of Wikipedia. Database backup dumps: a complete copy of all Wikimedia wikis, in the form of wikitext source and metadata embedded in XML. A number of...
- Hourly snapshot data on access to Wikipedia, captured from the Wikimedia Squid servers. Project counts show the total access in a time period to the different...