# Stanford Alpaca: An Instruction-following LLaMA Model

Note: We thank the community for feedback on Stanford Alpaca and for supporting our research. Our live demo is suspended until further notice.

This is the repo for the Stanford Alpaca project, which aims to build and share an instruction-following LLaMA model. The repo contains:

- The 52K data used for fine-tuning the model.

## Overview

The current Alpaca model is fine-tuned from a 7B LLaMA model on 52K instruction-following data generated by the techniques in the Self-Instruct paper, with some modifications that we discuss in the next section. In a preliminary human evaluation, we found that the Alpaca 7B model behaves similarly to the text-davinci-003 model on the Self-Instruct instruction-following evaluation suite.

Alpaca is still under development, and there are many limitations that have to be addressed. Importantly, we have not yet fine-tuned the Alpaca model to be safe and harmless. We thus encourage users to be cautious when interacting with Alpaca, and to report any concerning behavior to help improve the safety and ethical considerations of the model.

Our initial release contains the data generation procedure, dataset, and training recipe. We intend to release the model weights if we are given permission to do so by the creators of LLaMA. For now, we have chosen to host a live demo to help readers better understand the capabilities and limits of Alpaca, as well as a way to help us better evaluate Alpaca's performance on a broader audience.

Please read our release blog post for more details about the model, our discussion of the potential harm and limitations of Alpaca models, and our thought process for releasing a reproducible model.

References:

- LLaMA: Open and Efficient Foundation Language Models. Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample.
- Self-Instruct: Aligning Language Models with Self-Generated Instructions. Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi.

## Data Release

`alpaca_data.json` contains the 52K instruction-following data we used for fine-tuning the Alpaca model. This JSON file is a list of dictionaries; each dictionary contains the following fields:

- `instruction`: str, describes the task the model should perform.
- `input`: str, optional context or input for the task. For example, when the instruction is "Summarize the following article", the input is the article. Around 40% of the examples have an input.
- `output`: str, the answer to the instruction as generated by text-davinci-003.

We used the following prompt style for fine-tuning the Alpaca model (shown here for examples with a non-empty input field):

> Below is an instruction that describes a task. Write a response that appropriately completes the request.

During inference (e.g., for the web demo), we use the prompt variant that takes the user instruction with an empty input field.

## Data Generation Process

To generate the data:

1. Set the environment variable `OPENAI_API_KEY` to your OpenAI API key.
2. Install the dependencies with `pip install -r requirements.txt`.
3. Run `python -m generate_instruction generate_instruction_following_data`.

We built on the data generation pipeline from self-instruct and made the following modifications:

- We used text-davinci-003 to generate the instruction data instead of davinci.
- We wrote a new prompt (`prompt.txt`) that explicitly gives the requirements of instruction generation to text-davinci-003. Note: there is a slight error in the prompt we used, and future users should incorporate the edit in #24.
- We adopted much more aggressive batch decoding, i.e., generating 20 instructions at once, which significantly reduced the cost of data generation.
- We simplified the data generation pipeline by discarding the difference between classification and non-classification instructions.
- We only generated a single instance for each instruction, instead of the 2 to 3 instances in self-instruct.

This produced an instruction-following dataset with 52K examples at a much lower cost (less than $500). In a preliminary study, we also found our 52K generated data to be much more diverse than the data released by self-instruct. We plot the figure below (in the style of Figure 2 in the self-instruct paper) to demonstrate the diversity of our data.
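As an illustration of the data format described above, here is a minimal sketch of how one `alpaca_data.json` record could be rendered into a fine-tuning prompt. The header sentences are taken from this README; the `### Instruction:` / `### Input:` / `### Response:` section markers are an assumption for illustration and may differ from the exact template used in the repo.

```python
# Hedged sketch: render an alpaca_data.json record into a training prompt.
# The header text comes from the README; the "### ..." section markers are
# assumed for illustration only.
HEADER = ("Below is an instruction that describes a task. "
          "Write a response that appropriately completes the request.")

def format_example(ex: dict) -> str:
    """Build a prompt from a record with instruction/input/output fields."""
    if ex.get("input", "").strip():  # ~40% of examples have a non-empty input
        return (f"{HEADER}\n\n### Instruction:\n{ex['instruction']}\n\n"
                f"### Input:\n{ex['input']}\n\n### Response:\n")
    return (f"{HEADER}\n\n### Instruction:\n{ex['instruction']}\n\n"
            f"### Response:\n")

if __name__ == "__main__":
    record = {"instruction": "Summarize the following article",
              "input": "Some article text ...",
              "output": "A short summary."}
    # During training, the target completion (the `output` field) would be
    # appended after the prompt.
    print(format_example(record) + record["output"])
```

During inference with an empty input field, the same function falls back to the instruction-only variant, matching the README's note that the web demo uses the empty-input prompt.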
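The "around 40% of the examples have an input" statistic can be sanity-checked by counting non-empty `input` fields in the released file. A small sketch, demonstrated on an inline sample since `alpaca_data.json` is distributed with the repo rather than bundled here:

```python
import json

def input_fraction(examples):
    """Fraction of examples whose optional `input` field is non-empty."""
    with_input = sum(1 for ex in examples if ex.get("input", "").strip())
    return with_input / len(examples)

# Tiny inline sample in the release format. For the real check, load the
# full file instead:
#   examples = json.load(open("alpaca_data.json"))
sample = [
    {"instruction": "Summarize the following article",
     "input": "Some article text ...", "output": "A short summary."},
    {"instruction": "Name three primary colors",
     "input": "", "output": "Red, blue, yellow."},
]
print(input_fraction(sample))  # 0.5 on this two-record sample
```

On the released 52K dataset, this fraction should come out near 0.4 if the README's figure holds.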