

While Latitude focuses on prompt engineering and evaluation, the datasets you create and curate within the platform can be valuable assets for fine-tuning language models using external tools and services.

Why Use Latitude Datasets for Fine-tuning?

  • Curated Data: Datasets often contain carefully selected inputs paired with high-quality outputs (either expected outputs or manually reviewed model responses).
  • Real-World Examples: Datasets created from production logs represent actual user interactions.
  • Structured Format: Latitude datasets are already in a structured format (CSV), making them easier to process for fine-tuning.

Exporting Datasets from Latitude

  1. Navigate to the “Datasets” section in your project.
  2. Locate the dataset you want to use for fine-tuning.
  3. Find the option to Download or Export the dataset (usually represented by a download icon).
  4. Save the resulting CSV file to your local machine.

Preparing Data for Fine-tuning

Once exported, you’ll likely need to transform the CSV data into the specific format required by your chosen fine-tuning platform or library (e.g., OpenAI’s JSONL format, Hugging Face datasets format). Common steps include:
  1. Selecting Columns: Identify the columns containing the input prompt/context and the desired completion/output.
  2. Formatting: Convert each row into the required structure. For example, for OpenAI fine-tuning, you might create JSON objects like:
    {"prompt": "<Input from CSV column A>", "completion": "<Output from CSV column B>"}
    // or for chat models:
    {"messages": [{"role": "system", "content": "..."}, {"role": "user", "content": "<Input>"}, {"role": "assistant", "content": "<Output>"}]}
    
  3. Data Cleaning: Review the data for quality and consistency, and remove any low-quality or irrelevant examples.
  4. Splitting Data: You might need to split your exported dataset into training and validation sets.
Consult the documentation of your specific fine-tuning tool or platform for detailed formatting requirements.
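As a sketch of the selection, formatting, and splitting steps above, the following Python script reads an exported CSV and writes prompt/completion JSONL training and validation files. The column names `input` and `output` are placeholders; substitute the actual headers from your exported dataset, and check your provider's documentation for the exact record format it expects.

```python
import csv
import json
import random

def csv_to_jsonl(csv_path, train_path, valid_path, valid_fraction=0.1, seed=42):
    """Convert an exported dataset CSV into prompt/completion JSONL files.

    The column names 'input' and 'output' are placeholders; replace them
    with the headers from your own exported CSV.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = [
            {"prompt": row["input"], "completion": row["output"]}
            for row in csv.DictReader(f)
            # Basic cleaning: drop rows with an empty input or output.
            if row["input"].strip() and row["output"].strip()
        ]

    # Shuffle deterministically, then split into training and validation sets.
    random.Random(seed).shuffle(rows)
    split = int(len(rows) * (1 - valid_fraction))

    for path, subset in [(train_path, rows[:split]), (valid_path, rows[split:])]:
        with open(path, "w", encoding="utf-8") as f:
            for example in subset:
                f.write(json.dumps(example, ensure_ascii=False) + "\n")
```

For example, `csv_to_jsonl("dataset.csv", "train.jsonl", "valid.jsonl")` produces two files with roughly a 90/10 split, one JSON object per line.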

Example Scenario

Imagine you have a Latitude dataset created from manually reviewed chat logs (input_query, high_quality_response).
  1. Export: Download this dataset as a CSV from Latitude.
  2. Transform: Write a script (e.g., Python with pandas) to read the CSV and convert each row into the JSONL format required by the fine-tuning API you plan to use.
  3. Fine-tune: Upload the formatted JSONL file and run the fine-tuning job using the provider’s tools.
  4. Evaluate: After fine-tuning, you can even evaluate the new model’s performance back in Latitude by configuring it as a new provider/model and running evaluations against your datasets.
By leveraging the data curation work done in Latitude, you can streamline the preparation process for fine-tuning models for specialized tasks.
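The transform step in this scenario could look like the sketch below, which uses the standard library rather than pandas for portability. It assumes the CSV columns `input_query` and `high_quality_response` from the example; the system message is a placeholder you would replace with the instructions your production prompt actually used.

```python
import csv
import json

# Placeholder system prompt; replace with the instructions your
# production prompt actually used.
SYSTEM_PROMPT = "You are a helpful support assistant."

def to_chat_jsonl(csv_path, jsonl_path):
    """Convert (input_query, high_quality_response) rows into the
    chat-style JSONL format used by several fine-tuning APIs."""
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(jsonl_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            record = {
                "messages": [
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": row["input_query"]},
                    {"role": "assistant", "content": row["high_quality_response"]},
                ]
            }
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
```

The resulting `jsonl_path` file is what you would upload to the provider in the fine-tune step.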

Next Steps

  • Learn about Creating and Using Datasets in Latitude.
  • Refer to external documentation for specific fine-tuning platforms (OpenAI, Hugging Face, etc.).