Getting started with EDSL


This page provides guidance and examples of how to use EDSL, the Expected Parrot Domain-Specific Language: an open source Python library for simulating surveys and social science research with AI. 

If you are looking for information about our customizable applications, please get in touch or request access.


Introduction

EDSL is an open source Python package for simulating surveys and experiments, social science and market research, qualitative analysis, data labeling and other tasks using large language models. 

EDSL is created and distributed under the MIT License by Expected Parrot. It is inspired by research by Expected Parrot co-founders and others showing that AI agents can stand in for humans in certain kinds of empirical research: a new Homo Silicus.

Below you will find details on installing EDSL, example code and links to demo notebooks for getting started and exploring use cases. 

Please also see our docs for more details on how EDSL works and example code.

We're excited to launch EDSL and hope that users will join our Discord to share feedback, insights and ideas, connect with other users and learn about new features!

Getting started

Requirements

Installation

Download the latest version of EDSL on PyPI: https://pypi.org/project/edsl/ 

Get the latest updates at GitHub: https://github.com/goemeritus/edsl/


Install the package:

pip install edsl

Update your version:

pip install --upgrade edsl

Check your version:

pip show edsl

API keys

You will be prompted to provide API keys when you first access the package. You can skip this step by pressing return/enter. API keys are not required to construct surveys using edsl; however, you will need an API key in order to simulate responses using LLMs. The prompt will look like this:

==================================================

Please provide your OpenAI API key (https://platform.openai.com/api-keys).

If you would like to skip this step, press enter.

If you would like to provide your key, do one of the following:

1. Set it as a regular environment variable

2. Create a .env file and add `OPENAI_API_KEY=...` to it

3. Enter the value below and press enter:

Note: The EDSL API is coming soon! It will allow you to access all of the available LLMs with a single key managed by Expected Parrot.
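Options 1 and 2 in the prompt both come down to making `OPENAI_API_KEY` visible as an environment variable. The sketch below uses only the Python standard library to show what a `.env` file with `OPENAI_API_KEY=...` amounts to and how you might check that a key is set before simulating responses. The helper names `load_env_file` and `has_openai_key` are hypothetical and not part of EDSL, which may well read the `.env` file on its own.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Copy KEY=VALUE pairs from a .env-style file into os.environ.

    Hypothetical helper for illustration; not part of EDSL.
    Variables that are already set in the environment are left untouched.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        if key not in os.environ:
            os.environ[key] = value
            loaded[key] = value
    return loaded


def has_openai_key():
    """Return True if a non-empty OPENAI_API_KEY is visible to this process."""
    return bool(os.environ.get("OPENAI_API_KEY", "").strip())
```

Calling `load_env_file()` and then `has_openai_key()` at the top of a script is one simple way to fail early, before any question is run, if the key is missing.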

A quick example

Steps to create a question, administer it to an LLM and inspect the result:

# Select a question type
from edsl.questions import QuestionMultipleChoice

# Construct a question
q = QuestionMultipleChoice(
    question_name="example_question",
    question_text="How do you feel today?",
    question_options=["Bad", "OK", "Good"]
)

# Run the question using the default LLM (GPT-4)
results = q.run()

# View the results
results.select("example_question").print()

Please see our Starter Tutorial for more details and other examples.

Key concepts

EDSL is built around the concept of a Question that is answered by an AI Agent using a large language Model, generating a Result that can be analyzed, visualized and shared, or used to inform other questions. Questions of various types (free text, multiple choice, etc.) can be combined into a Survey and run in parallel or according to specified rules or skip logic (e.g., answer the next question based on the response to a prior question).

A question can also be parameterized with a Scenario that provides context or data to the question when it is run, allowing us to administer multiple versions of a question at once. (This is a useful way to use EDSL to conduct data labeling tasks, where a question is “asked” about each piece of data to generate a labeled dataset.) Surveys can also be run with many agents and models at once to provide different kinds of responses.
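To make the Scenario idea concrete, here is a plain-Python sketch of how one parameterized question expands into several versions, one per piece of data. It deliberately does not use EDSL classes: the `{review}` placeholder syntax, the `render_versions` helper and the sample reviews are all illustrative assumptions.

```python
def render_versions(question_template, scenarios):
    """Return one concrete question text per scenario (a dict of parameter values)."""
    return [question_template.format(**scenario) for scenario in scenarios]


# One question "asked" about each piece of data, as in a data labeling task.
template = "Is the following review positive or negative? Review: {review}"
scenarios = [
    {"review": "Great battery life."},
    {"review": "Stopped working after a week."},
]

for text in render_versions(template, scenarios):
    print(text)
```

Each rendered text would be answered separately, so two scenarios yield two labeled rows.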

Components

As shown in the simple example above, the EDSL package consists of several basic components:

Methods for running surveys

The principal methods for combining survey components and administering them are by() and run().

The run() method administers a survey to an LLM. It only requires a single question, as shown in the quick example above: 

results = q.run()  # This will generate results for a single question "q".

The by() method is optional: it specifies the scenarios, agents and models to be applied to a survey, and it precedes the run() method. When we administer a survey with scenarios, agents and models, the command takes the following general form:

results = survey.by(scenarios).by(agents).by(models).run()
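One way to picture the chained by() calls is as a cross product. Assuming every scenario is paired with every agent and every model (our reading of the general form above, not a statement about EDSL internals), the number of responses per question grows multiplicatively. The sketch below uses plain strings as stand-ins for EDSL objects; all of the names in it are made up for illustration.

```python
from itertools import product

# Plain-string stand-ins for Scenario, Agent and Model objects (illustrative only).
scenarios = ["review_1", "review_2", "review_3"]
agents = ["student", "retiree"]
models = ["model_a", "model_b"]

# Each (scenario, agent, model) combination corresponds to one simulated response.
combinations = list(product(scenarios, agents, models))

print(len(combinations))  # 3 * 2 * 2 = 12
print(combinations[0])
```

This is worth keeping in mind when estimating cost, since adding one more model or agent multiplies the number of LLM calls rather than adding to it.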


Accessing results

Convenient methods for inspecting and working with Results include:

.print() 

.select("agent.*", "answer.*", ...).print()

.filter("<response logic>"), e.g., .filter("<question_name> == 'Yes'")

.to_pandas().columns

.to_pandas()[["col1", "col2", ...]]

.sql("select * from self", shape="wide")

.word_cloud_plot("answer.<question_name>")

The select() method lets us choose the results that we want to inspect or work with, the filter() method lets us narrow the results based on responses, and the print() method lets us easily show them in a table.
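As a rough mental model, select() chooses columns and filter() chooses rows. The sketch below mimics that behavior on a list of dictionaries in plain Python. It is not how EDSL implements Results, and the `agent.persona` column, the helper names and the sample rows are assumptions made for illustration.

```python
# Rows stand in for individual results; keys mimic the "agent.*" / "answer.*" naming.
rows = [
    {"agent.persona": "student", "answer.example_question": "Good"},
    {"agent.persona": "retiree", "answer.example_question": "Bad"},
    {"agent.persona": "teacher", "answer.example_question": "Good"},
]


def filter_rows(rows, column, value):
    """Keep only the rows whose `column` equals `value` (narrowing by response)."""
    return [row for row in rows if row[column] == value]


def select_columns(rows, *columns):
    """Keep only the named columns in each row (choosing what to inspect)."""
    return [{col: row[col] for col in columns} for row in rows]


good = filter_rows(rows, "answer.example_question", "Good")
print(select_columns(good, "agent.persona"))
```

Here the filter keeps the two rows that answered "Good", and the selection then reduces each of them to the persona column.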

See this notebook for more details on methods for accessing Results.

In the simplest case, we can simulate a response to a question by appending run() to the question name:

question.run() 

If the question takes any specified parameters (called "Scenarios": variations of a question where an input is changed), we append the by() method first:

question.by(scenarios).run() 

If multiple questions have been combined into a survey we can administer them all at once in the same way as a single question:

survey.run() 

If we've created any AI agent personas to use or want to change the LLM we append those instructions as well:

survey.by(scenarios).by(agents).by(models).run() 

where each by() specifies a scenario, agent or LLM or a list of scenarios, agents or LLMs.

Then we can store the Results and use the select(), filter() and print() methods to readily inspect them:

result = question.run()

result.filter("question == 'yes'") 

result.select("question").print()

Tutorials & Demonstrations

In addition to the notebooks linked below, you can also access a workspace of interactive notebooks for EDSL on Deepnote.

General tutorial

This notebook contains a demonstration of how to design, create and administer a survey in EDSL.

Use cases

This notebook contains some examples of ways to use EDSL to conduct qualitative analyses.

Topics & examples

Learn how to construct questions, surveys and AI agents in EDSL and explore related topics.

FAQ

Below are some Q&A that we hope will help you get started. Feel free to post new questions at our Discord or send us a message at info@expectedparrot.com and we will get back to you as soon as possible!


What are some ways of exporting my results?

EDSL has a built-in method for using pandas to access your results.

To see all the columns (agent traits, questions, model info, prompts):

import pandas as pd

result.to_pandas().columns 

To select columns:

result.to_pandas()[['agent.<trait_name>', 'answer.<question_name>', ...]]

You can also save to a CSV file:

result.to_pandas().to_csv("filename.csv", index=False)

or:

pd.DataFrame(result).to_csv("filename.csv", index=False)


How do I specify the LLM that I want to use?
If an LLM is not specified, questions and surveys are administered using GPT-4. You can specify other LLMs using the by() method. See examples of LLM specification here, including administration of the same survey to multiple LLMs at once: https://examples.expectedparrot.com/edsl_tutorial/#Specifying-the-agent-LLM


How do I seed questions?
By default, questions are administered individually to AI agents. If you want to relate a question to the response for another question you can do this by seeding a question with a parameter that is the response to another question. See an example seeded question here: https://examples.expectedparrot.com/example_survey/#Seeding-questions.


How do I add skip or stop logic?
You can incorporate skip logic by adding a filter on responses to a question. You can add a stop rule using the add_stop_rule() method. See an example survey using a filter and stop logic here: https://examples.expectedparrot.com/example_survey/#Survey-rules 


How do I generate new results for an existing survey?
EDSL automatically caches survey results in your database to avoid unintended costs in rerunning identical questions. If you want to generate new responses to survey questions you can do this by specifying different LLMs or parameters of an LLM, such as the "temperature". See an example of how to modify LLM parameters here: https://examples.expectedparrot.com/edsl_tutorial/#Generating-new-results 
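The reason a changed parameter yields new responses can be sketched with a generic cache key. The example below is illustrative only: it assumes a cache keyed on the question text, model name and model parameters, which may differ from the fields EDSL's cache actually uses, and the `cache_key` helper is hypothetical.

```python
import hashlib
import json


def cache_key(question_text, model_name, parameters):
    """Build a deterministic key from everything that defines a request.

    Illustrative only: EDSL's real cache may key on different fields.
    """
    payload = json.dumps(
        {"question": question_text, "model": model_name, "parameters": parameters},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


first = cache_key("How do you feel today?", "gpt-4", {"temperature": 0.5})
repeat = cache_key("How do you feel today?", "gpt-4", {"temperature": 0.5})
changed = cache_key("How do you feel today?", "gpt-4", {"temperature": 1.0})

print(first == repeat)   # True: an identical request maps to the same cache entry
print(first == changed)  # False: a different temperature is treated as a new request
```

Under this model, rerunning an unchanged survey returns the stored answers, while any change to the model or its parameters produces a fresh lookup and therefore a new LLM call.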


Support & Community

Questions, feedback and feature requests are sincerely appreciated. There are multiple ways to reach us or connect with other users:

Need help building a survey? See some examples or get in touch for live help.

Integrations

Interested in exporting an EDSL survey to another platform, or importing a survey and responses that you've created elsewhere? Get in touch to learn more about our integrations for working with content at other platforms and combining simulated and real responses.