- Saheed Azeez created Naijaweb, a dataset of 230 million GPT-2 tokens built from Nairaland, teaching himself web scraping and data cleaning in the process.
- He ran into challenges along the way but succeeded with the help of tools such as Hugging Face, and he now aims to train a large language model on the dataset.