Sources & Training

This section describes the Sources tab in your AI Assistant / Chatbot.

Sources

Sources are your AI Assistant's data. Your AI can have many different sources of data, including your website, files and folders, plain text and more.

Website & URLs

How to use your website's data in your AI Assistants.

Crawling

One convenient and fast way to ingest data is to let us crawl your website. Just enter your website URL, and we will fetch all your pages and retrieve the data for your AI Assistant.

By default, if you enter your website's root URL, for example https://my-site.com/, we will look for ALL the pages on your website, starting at the root level.

You can also restrict the crawler to a given "level" or "base URL". For example, entering https://my-site.com/blog/ will only fetch pages that share the base URL you provided, in this case only pages in the /blog/ area of your website. This gives you more control over which pages our crawler fetches.
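The base URL restriction described above can be pictured as a simple prefix check on each discovered link. The following is a minimal sketch only, not our crawler's actual implementation; the function name `in_scope` and the exact matching rules (same host, path under the base path) are assumptions made for illustration.

```python
from urllib.parse import urlparse


def in_scope(url: str, base_url: str) -> bool:
    """Return True if `url` is on the same host as `base_url` and under its path.

    Hypothetical helper: illustrates base-URL scoping, not the real crawler.
    """
    page, base = urlparse(url), urlparse(base_url)
    # Normalize both paths to end with "/" so that "/blog" and "/blog/" behave
    # the same, and so that "/blogging" is not mistaken for part of "/blog/".
    base_path = base.path if base.path.endswith("/") else base.path + "/"
    page_path = page.path if page.path.endswith("/") else page.path + "/"
    return page.netloc == base.netloc and page_path.startswith(base_path)


links = [
    "https://my-site.com/blog/post-1",
    "https://my-site.com/about",
    "https://my-site.com/blogging",
]
# Keep only the links that fall under the /blog/ base URL.
blog_links = [link for link in links if in_scope(link, "https://my-site.com/blog/")]
```

With the root URL as the base, every page on the same host is in scope, which matches the default behavior described earlier.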

Recrawling existing links

To recrawl your links, use the dropdown menu and choose between recrawling selected links or recrawling all links. Use this functionality when you know some of your content has been updated. Recrawling does not look for new URLs; it only fetches updated content for the URL sources you have already added.
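Conceptually, recrawling refreshes the content of URLs you already have and never adds new ones. Here is a minimal sketch of that behavior, assuming sources are kept as a mapping from URL to content; the `recrawl` function and the injected `fetch` callable are hypothetical names used for illustration, not part of our product.

```python
from typing import Callable, Dict, Iterable, Optional


def recrawl(
    sources: Dict[str, str],
    fetch: Callable[[str], str],
    selected: Optional[Iterable[str]] = None,
) -> Dict[str, str]:
    """Return a copy of `sources` with refreshed content for the chosen URLs.

    `selected=None` means "recrawl all links". URLs that are not already in
    `sources` are ignored, so no new URL is ever added.
    """
    if selected is None:
        targets = set(sources)
    else:
        targets = set(selected) & set(sources)
    updated = dict(sources)
    for url in targets:
        updated[url] = fetch(url)  # fetch is supplied by the caller
    return updated
```

Passing a list of URLs corresponds to recrawling selected links, and omitting it corresponds to recrawling all links.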

Files

You can also give your AI Assistant data by uploading files from your computer. When you upload a file, we extract all the text in it and use that text as a data source for your AI Assistant.

We currently support four different types of files:

  • Microsoft Word (.docx)
  • PDF (.pdf)
    Please make sure your PDF files have selectable text, or else we will not be able to read them.
  • Text (.txt)
  • Markdown (.md)
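If you want to check a batch of files before uploading, a small script like the one below can help. It is only a sketch based on the extensions listed above; the names `SUPPORTED_EXTENSIONS` and `is_supported` are made up for this example, and the check looks at the file name only, so it cannot tell whether a PDF has selectable text.

```python
from pathlib import Path

# File extensions taken from the list of supported file types above.
SUPPORTED_EXTENSIONS = {".docx", ".pdf", ".txt", ".md"}


def is_supported(filename: str) -> bool:
    """Return True if the file name ends in one of the supported extensions."""
    # Compare case-insensitively so "Report.PDF" is treated like "report.pdf".
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


candidates = ["handbook.docx", "Report.PDF", "notes.doc", "readme.md"]
uploadable = [name for name in candidates if is_supported(name)]
```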

Plain text & Markdown

Another way to ingest data is to copy and paste, or simply type, your text into our Markdown editor. You can use plain text or Markdown.

Training your AI Assistant

After you have added (or updated) your sources, you need to train your AI Assistant on these sources. Your sources will not be used by your AI until you train it to do so.

Under the Sources tab in your AI Assistant, you will find the Train assistant button. Clicking it starts training your assistant on all the sources it has not yet been trained on. Depending on the number of sources and the amount of data, this process may take some time to complete.

After your AI has finished training, you can start asking it questions, and it will reply with answers based on your own data. Answers may vary from time to time and depend on the quality of the sources you provided.

If you upload a file, add a URL, or update a source that already exists, you only have to retrain your AI Assistant if the source data has changed. You can see which sources are not yet trained in the Sources tab.
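One common way to decide whether a source has changed since the last training run is to compare content fingerprints. The sketch below illustrates that general idea with SHA-256 hashes. It is an assumption for illustration, not a description of how our product detects changes, and the names `fingerprint` and `sources_needing_training` are hypothetical.

```python
import hashlib
from typing import Dict, List


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sources_needing_training(
    current: Dict[str, str], trained: Dict[str, str]
) -> List[str]:
    """List sources that are new or whose content differs from the trained version.

    current: source name -> current content
    trained: source name -> fingerprint recorded at the last training run
    """
    return sorted(
        name
        for name, content in current.items()
        if trained.get(name) != fingerprint(content)
    )


# Fingerprints recorded at the last training run (example data).
trained = {"faq.md": fingerprint("Opening hours: 9-17")}
# Current state: faq.md is unchanged, pricing.txt was added afterwards.
current = {"faq.md": "Opening hours: 9-17", "pricing.txt": "Basic plan"}
pending = sources_needing_training(current, trained)
```

In this example only the newly added source ends up in `pending`, because the unchanged source still matches its recorded fingerprint.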