This extractor extracts data from Facebook Graph API. Use Facebook Pages, Facebook Ads, and Instagram extractors to download data from Facebook Graph API.
Table of Contents:
The component supports all Facebook Graph API endpoints available through nested queries:
- feed - page posts including comments and likes
- posts / published_posts - published posts
- insights - page-level metrics
- video / video_reels / videos - video content and metrics
- ratings - page ratings and reviews
- conversations - page inbox conversations
- And any other Graph API edge you can access with your token
The configuration consists of three main parts:
Facebook Graph API version (default: "v23.0"). It is recommended to keep this up to date as Meta deprecates older versions.
Dictionary of accounts from which data will be extracted. Each account contains:
- id: Unique account identifier
- name: Account name
- account_id: Advertising account ID (optional, Facebook Ads only)
- business_name: Business name (optional)
- currency: Currency (optional)
- category: Page category (optional)
- fb_page_id: Facebook page ID linked to an Instagram account (optional, Instagram only)
List of queries for data extraction. Supports two query types: Nested Query and Async Insights Query.
Used for Facebook Pages and Instagram. Fetches data from any Graph API edge.
{
"id": 1,
"name": "query_name",
"type": "nested-query",
"disabled": false,
"query": {
"path": "feed",
"fields": "message,created_time,likes,comments{message,from}",
"ids": "",
"limit": "25",
"since": "30 days ago",
"until": "today"
}
}Parameters:
| Parameter | Description |
|---|---|
path |
Graph API edge to query (e.g., feed, posts, insights). Leave empty to query the account root node. |
fields |
Comma-separated list of fields or a nested Insights DSL expression. |
ids |
Comma-separated account/page IDs to query. If empty, all configured accounts are used. |
limit |
Page size for API pagination (default 25, max 100). |
since |
Start of time range. Accepts time expressions or ISO dates. |
until |
End of time range. Accepts time expressions or ISO dates. |
time-based-pagination |
Set to true to use time-based pagination instead of cursor-based (useful for large date ranges). |
stop-on-empty-response |
Set to true to stop pagination when an empty response is received. |
run-by-id |
Set to true to query each account ID individually instead of in batch. |
For metrics data, Facebook uses a special dot-notation DSL inside the fields parameter. This is supported for both Facebook Pages (insights) and Instagram (insights), as well as post-level metrics (insights on a feed or posts edge).
The general syntax is:
insights.<modifier1>(<value>).<modifier2>(<value>)...
You can optionally append {name, period, values} at the end to control which sub-fields are returned.
Page-level example (Facebook Pages):
insights.since(30 days ago).time_increment(1).metric(page_impressions, page_views_total, page_fans)
Post-level example (on feed path):
insights.metric(post_impressions, post_clicks).period(lifetime){name, values}
With breakdown (new Meta API format):
insights.metric(post_media_view).breakdown(is_from_ads).period(lifetime)
Instagram example:
insights.since(30 days ago).period(day).metric_type(total_value).breakdown(media_product_type).metric(reach, total_interactions)
Mixed fields and insights (on feed path):
message, created_time, permalink_url, insights.metric(post_impressions).period(lifetime){name, values}
Specifies which metrics to retrieve. Multiple metrics are comma-separated.
- Where to find valid metric names:
- Facebook Pages: Page Insights metrics reference
- Deprecated Page metrics (replaced Nov 2025): Deprecated metrics
- Instagram: Instagram Insights metrics
- Post-level metrics: Post Insights
insights.metric(page_impressions, page_views_total, page_fans)
insights.metric(post_clicks, post_impressions_unique)
Note on deprecated metrics: As of November 15, 2025, Meta replaced metrics like
post_impressions_fan,page_impressions_paidwith new metrics that require explicitbreakdownparameters. For example,post_impressions_fan→post_media_viewwithbreakdown(is_from_followers). See breakdown below.
Aggregation period for the metric values.
| Value | Description |
|---|---|
day |
Daily aggregation |
week |
Weekly aggregation |
month / days_28 |
Monthly aggregation |
lifetime |
Entire lifetime of the object (common for post-level insights) |
insights.metric(page_impressions).period(day)
insights.metric(post_impressions).period(lifetime)
Splits metric values by a dimension. The breakdown value is returned in the key1/key2 output columns.
Common breakdown values for Facebook Pages:
| Breakdown | Description |
|---|---|
is_from_ads |
Whether the impression came from an ad (0 or 1) |
is_from_followers |
Whether the viewer is a follower (0 or 1) |
action_type |
Type of action performed |
Common breakdown values for Instagram:
| Breakdown | Description |
|---|---|
country |
Country of the audience |
city |
City of the audience |
age |
Age group |
gender |
Gender |
media_product_type |
Type of media (POST, REEL, STORY, etc.) |
Multiple breakdowns (up to 2 are captured in key1/key2, sorted alphabetically):
insights.metric(post_media_view).breakdown(is_from_ads, is_from_followers).period(lifetime)
Important:
breakdownhere is the Pages/Instagram DSL keyword. It is different from thebreakdowns=URL parameter used in Async Insights Queries for Facebook Ads. Do not confuse them.
Time range filter within the DSL. Accepts the same time expressions as the top-level since/until query parameters.
insights.since(30 days ago).until(today).metric(page_impressions)
insights.since(now).metric(post_impressions_paid)
Tip:
since(now)is a common pattern meaning "fetch from the most recent data point". The actual date range is controlled by the top-levelsince/untilquery parameters.
Controls the time granularity of the returned data points. Value is in days or a period keyword.
insights.metric(page_impressions).time_increment(1) # daily data points
insights.metric(page_impressions).time_increment(monthly)
Renames the output metric. Useful when using the same metric with different breakdowns in separate queries.
insights.metric(post_media_view).breakdown(is_from_ads).as(post_media_view_from_ads).period(lifetime)
Used for Instagram demographic insights instead of since/until.
| Value | Description |
|---|---|
this_month |
Current month |
last_month |
Previous month |
this_year |
Current year |
insights.metric(engaged_audience_demographics).period(lifetime).timeframe(this_month).metric_type(total_value).breakdown(country)
Controls how metric values are aggregated.
| Value | Description |
|---|---|
total_value |
Sum of all values |
time_series |
Values over time |
insights.period(day).metric_type(total_value).breakdown(media_product_type).metric(reach, total_interactions)
Both the top-level since/until parameters and the DSL since()/until() modifiers accept flexible time expressions:
| Expression | Meaning |
|---|---|
today |
Today's date |
yesterday |
Yesterday |
now |
Current timestamp (often used in DSL to mean "most recent") |
30 days ago |
30 days before today |
3 day ago |
3 days before today (singular also accepted) |
this month |
Start of the current month |
2025-01-01 |
Absolute ISO date |
last_7d |
Last 7 days — Ads API date_preset format only |
last_90d |
Last 90 days — Ads API date_preset format only |
The path parameter specifies which Graph API edge to traverse for each account. Common values:
| Path | Description | Component |
|---|---|---|
| (empty) | Account root node | Pages, Instagram, Ads |
feed |
Published and scheduled posts | Pages |
posts |
Published posts | Pages |
published_posts |
All published posts (alternative to posts) |
Pages |
insights |
Page-level insights directly | Pages |
video |
Videos | Pages |
video_reels |
Reels | Pages |
videos |
All video objects | Pages |
ratings |
Page ratings and reviews | Pages |
conversations |
Inbox conversations | Pages |
me/posts |
Posts by the authenticated user | Pages |
ads-action-stats |
Ad action statistics | Ads |
You can also use absolute Graph API URLs as the path:
"path": "https://graph.facebook.com/v3.3/SomePage/feed"Used exclusively for Facebook Ads. Submits an asynchronous job to the Marketing API and waits for results. Required for large ad account reports where synchronous requests would time out.
{
"id": 1,
"name": "campaign_performance",
"type": "async-insights-query",
"disabled": false,
"query": {
"ids": "act_123456789",
"parameters": "level=campaign&fields=campaign_id,campaign_name,impressions,clicks,spend&date_preset=last_7d&time_increment=1&action_breakdowns=action_type&breakdowns=impression_device",
"limit": "100"
}
}Parameters:
| Parameter | Description |
|---|---|
ids |
Comma-separated ad account IDs (must start with act_). |
parameters |
URL query string formatted according to the Marketing API Insights spec. |
limit |
Page size for result pagination. |
since / until |
Optional date range override. |
split-query-time-range-by-day |
Set to true to split the query into daily requests (useful for large accounts hitting API limits). |
Key parameters options:
| Option | Description |
|---|---|
level |
Aggregation level: account, campaign, adset, ad |
fields |
Comma-separated metric/dimension fields |
date_preset |
Shorthand date range: last_7d, last_30d, last_90d, last_month, this_month, etc. |
time_increment |
Granularity in days, or monthly |
action_breakdowns |
Break down action metrics by: action_type, action_device, action_reaction, etc. |
breakdowns |
Break down impressions/clicks by: impression_device, publisher_platform, product_id, etc. |
action_attribution_windows |
Attribution windows: 1d_click, 7d_click, 1d_view, 28d_click, etc. |
filtering |
JSON array of filters, e.g. [{"field":"action_type","operator":"IN","value":["purchase"]}] |
use_account_attribution_setting |
Use the account's default attribution setting (true/false) |
use_unified_attribution_setting |
Use unified attribution setting (true/false) |
The parameters field also supports the DSL format:
insights.level(campaign).action_attribution_windows(7d_click).action_breakdowns(action_type).date_preset(last_7d).time_increment(1){campaign_name,campaign_id,actions,action_values}
Where to find valid fields and breakdowns: Use the Marketing API Insights reference and the Graph API Explorer to discover available fields for your account.
The component generates the following tables:
Contains information about configured accounts with columns:
id,name,account_id,business_name,currency,category
For each query, one or more tables are created based on the data structure:
{query_name}— top-level fields{query_name}_{nested_edge}— nested objects (e.g.,feed_comments,feed_likes)
Insights data is flattened into rows with the following columns:
| Column | Description |
|---|---|
id |
Object ID |
parent_id |
Parent object ID |
ex_account_id |
Account ID the data was fetched for |
fb_graph_node |
Describes position in the hierarchy (e.g., page_feed_insights) |
key1 |
First breakdown dimension value |
key2 |
Second breakdown dimension value |
value |
Metric value |
end_time |
Timestamp of the data point |
name |
Metric name |
title |
Metric title |
description |
Metric description |
period |
Aggregation period |
Note on
key1/key2: For newer Meta metrics usingbreakdown(...), the breakdown dimension value (e.g.,"0","1") is placed inkey1/key2. If two breakdowns are used, they are sorted alphabetically and mapped tokey1andkey2respectively. For older nested metrics likepost_reactions_by_type_total, the reaction type name (e.g.,"like","love") is placed inkey1.
By default, data is stored in a bucket with the format in.c-{component-id}-{configuration-id}.
If the configuration contains a bucket-id parameter, data will be stored in the specified bucket instead. This is useful for:
- Migration from legacy components: Redirect data to the original bucket used by previous extractors
- Custom bucket organization: Create your own bucket structure
{
"parameters": {
"bucket-id": "in.c-my-custom-facebook-bucket",
"api-version": "v23.0",
"accounts": {},
"queries": []
}
}To customize the local data folder path, replace the CUSTOM_FOLDER placeholder with your desired path in the docker-compose.yml file:
volumes:
- ./:/code
- ./CUSTOM_FOLDER:/data
Clone this repository and run the component using the following commands:
git clone https://github.com/keboola/component-meta component-meta
cd component-meta
docker-compose build
docker-compose run --rm dev
Run the test suite and perform lint checks using this command:
docker-compose run --rm test
The project uses uv for Python dependency management. Dependencies are defined in pyproject.toml.
src/component.py- Main component logicsrc/client.py- Facebook Graph API clientsrc/configuration.py- Configuration model definitionssrc/output_parser.py- Data parsing and transformationsrc/page_loader.py- API page loading