Sources Configuration

Sources (or connectors) import data from external APIs (Google Analytics, Facebook, etc.) or databases (Redis, Firebase, etc.) into destinations. Each source represents a connection to a particular API or database.

The synchronization scheduling engine is called sync tasks.

Luden supports 3 types of sources:

  • Native sources (for example, Google Ads, Facebook) 🚀

  • Singer-based sources. Singer is a collection of ETL connectors written in Python. Singer-based sources are not part of the Luden codebase: Luden just runs the Python package, processes its output, and saves the data to a destination. (NOT RECOMMENDED!)

  • Airbyte-based sources. Airbyte is an ETL framework similar to Singer. Airbyte sources are distributed as Docker images. Luden pulls those images, runs them, and puts the output into a database. 🚀
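
To make "processes output" more concrete: Singer taps and Airbyte connectors both write JSON messages to standard output, one per line, and only messages of type RECORD carry data rows. The sketch below groups such rows by stream name. It is an illustration of the message formats, not Luden's actual implementation; the function name `collect_records` and the sample lines are hypothetical.

```python
import json
from collections import defaultdict


def collect_records(lines):
    """Group RECORD payloads by stream name from Singer- or Airbyte-style JSON lines."""
    by_stream = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            msg = json.loads(line)
        except json.JSONDecodeError:
            continue  # connectors may also print plain log lines; skip them
        if not isinstance(msg, dict) or msg.get("type") != "RECORD":
            continue  # SCHEMA, STATE, LOG, etc. carry no data rows
        record = msg.get("record")
        if "stream" in msg:
            # Singer layout: {"type": "RECORD", "stream": ..., "record": {...}}
            by_stream[msg["stream"]].append(record)
        elif isinstance(record, dict) and "stream" in record:
            # Airbyte layout: {"type": "RECORD", "record": {"stream": ..., "data": {...}}}
            by_stream[record["stream"]].append(record.get("data", {}))
    return dict(by_stream)


# Hypothetical connector output, mixing both layouts for illustration only.
sample_output = [
    '{"type": "SCHEMA", "stream": "users", "schema": {"properties": {}}, "key_properties": ["id"]}',
    '{"type": "RECORD", "stream": "users", "record": {"id": 1}}',
    '{"type": "RECORD", "record": {"stream": "orders", "data": {"id": 7}, "emitted_at": 1600000000000}}',
    'not json',
    '{"type": "STATE", "value": {}}',
]
records = collect_records(sample_output)
```

In a real pipeline the lines would come from the subprocess or container output stream, and the grouped rows would then be written to the destination table of the matching collection.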

Collection Configuration

Sources must explicitly define a list of collections (or streams). Each collection defines a synchronization schedule and a destination table name (the table name is prefixed with source_id to avoid collisions). Here's an example configuration snippet:

sources:
  firebase_example_id:
    collections:
      - name: "some_name"
        type: "collection_type_id"
        table_name: "table_name_for_data"
        start_date: "2020-06-01"
        schedule: '@daily' # cron expression, see below
        parameters:
          field1: "value"
          field2: ["values"]
          field3:
            some_object:
        ...
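
As a rough illustration of the prefixing rule described above, the sketch below derives a destination table name for each collection from an already-parsed configuration. The underscore separator, the fallback to the collection name when table_name is absent, and the function name `destination_tables` are assumptions made for this example; the exact format Luden uses is not stated here.

```python
def destination_tables(sources, separator="_"):
    """Map (source_id, collection name) to a source-prefixed table name.

    The separator and the fallback to the collection name are assumptions.
    """
    tables = {}
    for source_id, source in sources.items():
        for collection in source.get("collections", []):
            base = collection.get("table_name") or collection["name"]
            tables[(source_id, collection["name"])] = f"{source_id}{separator}{base}"
    return tables


# The content of the `sources` key from the snippet above, as a parsed dict.
config = {
    "firebase_example_id": {
        "collections": [
            {
                "name": "some_name",
                "type": "collection_type_id",
                "table_name": "table_name_for_data",
                "schedule": "@daily",
            }
        ]
    }
}
tables = destination_tables(config)
```

Because the source ID is part of every table name, two sources that both define a collection with the same table_name would still write to different tables.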
