Facebook with Latestnigeriannews  Twieet with latestnigeriannews  RSS Page Feed
Home  |  All Headlines  |  Punch  |  Thisday  |  Daily Sun  |  Vanguard   |  Guardian  |  The Nation  |  Daily Times  |  Daily Trust  |  Daily Independent
World  |  Sports  |  Technology  |  Entertainment  |  Business  |  Politics  |  Tribune  |  Leadership  |  National Mirror  |  BusinessDay  |  More Channels...

Viewing Mode:

Archive:

  1.     Tool Tips    
  2.    Collapsible   
  3.    Collapsed     
Click to view all Entertainment headlines today

Click to view all Sports headlines today

Harvard Is Releasing a Massive Free AI Training Dataset Funded by OpenAI and Microsoft

Published by Slashdot on Thu, 12 Dec 2024


Harvard University announced Thursday it's releasing a high-quality dataset of nearly one million public-domain books that could be used by anyone to train large language models and other AI tools. From a report: The dataset was created by Harvard's newly formed Institutional Data Initiative with funding from both Microsoft and OpenAI. It contains books scanned as part of the Google Books project that are no longer protected by copyright. Around five times the size of the notorious Books3 dataset that was used to train AI models like Meta's Llama, the Institutional Data Initiative's database spans genres, decades, and languages, with classics from Shakespeare, Charles Dickens, and Dante included alongside obscure Czech math textbooks and Welsh pocket dictionaries. Greg Leppert, executive director of the Institutional Data Initiative, says the project is an attempt to "level the playing field" by giving the general public, including small players in the AI industry and individual researchers, access to the sort of highly-refined and curated content repositories that normally only established tech giants have the resources to assemble. "It's gone through rigorous review," he says. Leppert believes the new public domain database could be used in conjunction with other licensed materials to build artificial intelligence models. "I think about it a bit like the way that Linux has become a foundational operating system for so much of the world," he says, noting that companies would still need to use additional training data to differentiate their models from those of their competitors.Read more of this story at Slashdot.
Click here to read full news..

All Channels Nigerian Dailies: Punch  |  Vanguard   |  The Nation  |  Thisday  |  Daily Sun  |  Guardian  |  Daily Times  |  Daily Trust  |  Daily Independent  |   The Herald  |  Tribune  |  Leadership  |  National Mirror  |  BusinessDay  |  New Telegraph  |  Peoples Daily  |  Blueprint  |  Nigerian Pilot  |  Sahara Reporters  |  Premium Times  |  The Cable  |  PM News  |  APO Africa Newsroom

Categories Today: World  |  Sports  |  Technology  |  Entertainment  |  Business  |  Politics  |  Columns  |  All Headlines Today

Entertainment (Local): Linda Ikeji  |  Bella Naija  |  Tori  |  Daily News 24  |  Pulse  |  The NET  |  DailyPost  |  Information Nigeria  |  Gistlover  |  Lailas Blog  |  Miss Petite  |  Olufamous  |  Stella Dimoko Korkus Blog  |  Ynaija  |  All Entertainment News Today

Entertainment (World): TMZ  |  Daily Mail  |  Huffington Post

Sports: Goal  |  African Football  |  Bleacher Report  |  FTBpro  |  Soft Football  |  Kickoff  |  All Sports Headlines Today

Business & Finance: Nairametrics  |  Nigerian Tenders  |  Business Insider  |  Forbes  |  Entrepreneur  |  The Economist  |  BusinessTech  |  Financial Watch  |  BusinessDay  |  All Business News Headlines Today

Technology (Local): Techpoint  |  TechMoran  |  TechCity  |  Innovation Village  |  IT News Africa  |  Technology Times  |  Technext  |  Techcabal  |  All Technology News Headlines Today

Technology (World): Techcrunch  |  Techmeme  |  Slashdot  |  Wired  |  Hackers News  |  Engadget  |  Pocket Lint  |  The Verge

International Networks:   |  CNN  |  BBC  |  Al Jazeera  |  Yahoo

Forum:   |  Nairaland  |  Naij

Other Links: Home   |  Nigerian Jobs