Training and Fine-Tuning Large Language Models with Turkish Datasets (Büyük dil modellerinin Türkçe verisetleri ile eğitilmesi ve ince ayarlanması)
Abstract
Large language models have advanced enormously, attracted wide attention, and are undergoing a phase of intense research. Some of the developed models and training datasets have been made openly accessible, so they can be further fine-tuned to obtain specialized models for specific tasks. For the Turkish language, however, open-access models do not provide satisfactory coverage, and the same shortfall is seen in published datasets. In this work, we propose ways to mitigate this issue: creating large Turkish datasets, training LLMs on them, and fine-tuning pre-trained models with Turkish inputs. We focus on open-access language models and datasets, and report our findings from Turkish-based training experiments along with the problems encountered. We conclude with the outcomes of these experiments and propose ideas for further work.