
How to scrape TikTok posts and video data

Use a TikTok posts scraper to extract video URLs, engagement metrics, and profile fields from public TikTok post links.

Scrapers | TikTok | May 23, 2026

TikTok video pages contain captions, engagement counts, video URLs, music metadata, and publish timestamps. The ScrapeNow TikTok posts scraper extracts all of these and returns structured JSON per video.

Use the URL scraper when you already have a list of TikTok post links. Use the search scrapers when your job starts with a keyword or creator profile.

How to use this scraper

[Figure: The TikTok Posts Scraper job pipeline, from input to stored output.]

The direct URL scraper, Extract TikTok post data, is the shortest path when your input is a list of TikTok video URLs. Use Search TikTok posts by keyword when ScrapeNow needs to find posts from a search term.

Step 1. Get the input values

For the TikTok Posts Extract by URL scraper, the inputs are:

  • url, Required. URL of a TikTok video. It must start with https://www.tiktok.com/.
  • country, Optional. Send a two-letter ISO 3166-1 country code as a string, such as US.

[Figure: Search TikTok and open the video whose URL you need.]

Open `tiktok.com`, open the target video, then copy the URL from the address bar. Use the browser URL instead of a copied share link when possible. Browser URLs usually contain the creator handle and `/video/{post_id}`, which makes validation easier.

[Figure: TikTok video page with the URL in the browser address bar.]

A valid input object looks like this:

{
  "url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415",
  "country": "US"
}

For keyword search, use these inputs:

  • keyword, Search term to send to TikTok, such as coldplay.
  • max_posts, Maximum number of posts to scrape. Send this as an integer.
  • country, Optional. Send a two-letter ISO 3166-1 country code as a string, such as US.
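
As a minimal sketch, the keyword inputs listed above can be assembled and type-checked in Python before a job is submitted. The helper name `build_keyword_input` is hypothetical; only the field names `keyword`, `max_posts`, and `country` come from the list above.

```python
from typing import Optional


def build_keyword_input(keyword: str, max_posts: int, country: Optional[str] = None) -> dict:
    """Build one input dict for the keyword search scraper (hypothetical helper)."""
    keyword = keyword.strip()
    if not keyword:
        raise ValueError("keyword must not be empty")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(max_posts, bool) or not isinstance(max_posts, int) or max_posts < 1:
        raise ValueError("max_posts must be a positive integer")

    item = {"keyword": keyword, "max_posts": max_posts}
    if country is not None:
        if len(country) != 2 or not country.isalpha():
            raise ValueError("country must be a two-letter code")
        item["country"] = country.upper()
    return item


print(build_keyword_input("coldplay", 50, "us"))
# {'keyword': 'coldplay', 'max_posts': 50, 'country': 'US'}
```

Leaving `country` out of the dict when it is not supplied keeps the optional field optional instead of sending an empty value.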

[Figure: Type a keyword into TikTok search to find posts.]

For profile-based scraping, use [Pull posts from a TikTok profile](/scraper/tiktok-posts-search-by-profile-url) when you need posts from one creator profile. This scraper works well for creator monitoring, campaign checks, and backfilling a creator’s recent posts.

[Figure: Open a TikTok creator profile to collect their posts.]

Open the profile on TikTok and copy the profile URL.

[Figure: Open the TikTok creator profile and copy the URL from the address bar.]

Paste the profile URL into the scraper input.

[Figure: Profile URL pasted into the scraper input form.]

If you need pinned posts included, enable that optional setting. Pinned posts can be old, so keep them separate from recent-post analysis if post recency matters.

[Figure: Toggle the include pinned posts option to capture older pinned content.]

Step 2. Run the API code

This example runs the TikTok Posts Extract by URL scraper, polls the job every 5 seconds, waits up to 3600 seconds, then saves the JSON file locally.

The script also creates an output directory before writing the file. Keep that in production code. Failed file writes create avoidable reruns.

"""
Configuration:
    - Set SCRAPER_SLUG to the scraper you want to run.
    - Set SCRAPER_INPUTS to the list of input dicts matching that scraper's schema.
    - Set API_KEY to your ScrapeNow API key.
"""

import sys
import time
import json
import requests
import os

API_KEY = "YOUR_API_KEY"

SCRAPER_SLUG = "tiktok-posts-extract-by-url"

SCRAPER_INPUTS = [
    {
        "url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415",
        "country": "US"
    }
]

BASE_URL = "https://api.scrapenow.io/api/v1/scraping"
TIMEOUT_SECONDS = 3600
POLL_INTERVAL = 5
SPINNER = "|/-\\"


def build_headers(api_key: str, content_type: str | None = None) -> dict:
    headers = {"Authorization": f"Bearer {api_key}"}
    if content_type:
        headers["Content-Type"] = content_type
    return headers


def trigger_scrape(slug: str, inputs: list[dict]) -> str:
    url = f"{BASE_URL}/scrape?scraper={slug}"
    response = requests.post(
        url,
        headers=build_headers(API_KEY, "application/json"),
        json={"inputs": inputs},
        timeout=30,  # fail fast instead of hanging on a stalled connection
    )
    response.raise_for_status()
    return response.json()["data"]["job_id"]


def poll_until_done(job_id: str) -> str:
    start = time.time()
    i = 0
    while True:
        elapsed = time.time() - start
        if elapsed > TIMEOUT_SECONDS:
            print(f"\nTimeout after {TIMEOUT_SECONDS}s")
            sys.exit(1)
        response = requests.get(
            f"{BASE_URL}/jobs/{job_id}",
            headers=build_headers(API_KEY),
            timeout=30,  # a stalled request must not bypass TIMEOUT_SECONDS
        )
        response.raise_for_status()
        data = response.json()
        status = data["data"]["status"]
        mins, secs = divmod(int(elapsed), 60)
        sys.stdout.write(
            f"\r[{SPINNER[i % 4]}] Waiting... {status} ({mins}m {secs:02d}s)  "
        )
        sys.stdout.flush()
        if status in ("completed", "failed"):
            print()
            return status
        time.sleep(POLL_INTERVAL)
        i += 1


def fetch_results(job_id: str) -> dict:
    response = requests.get(
        f"{BASE_URL}/jobs/{job_id}/results?format=json",
        headers=build_headers(API_KEY),
        timeout=30,  # fail fast instead of hanging on a stalled connection
    )
    response.raise_for_status()
    return response.json()


def save_results(data: dict, slug: str) -> str:
    os.makedirs("output", exist_ok=True)
    filename = os.path.join("output", f"{slug}.json")
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)
    return filename


def main() -> None:
    print(f"Triggering scraper: {SCRAPER_SLUG}")
    job_id = trigger_scrape(SCRAPER_SLUG, SCRAPER_INPUTS)
    print(f"Job started: {job_id}")
    final_status = poll_until_done(job_id)
    if final_status != "completed":
        print(f"Job failed with status: {final_status}")
        sys.exit(1)
    print("Fetching results...")
    results = fetch_results(job_id)
    output_file = save_results(results, SCRAPER_SLUG)
    print(f"Results saved to: {output_file}")


if __name__ == "__main__":
    main()

The same API pattern works for the other scrapers in this group. Use Search TikTok posts by keyword for keyword discovery. Use Pull posts from a TikTok profile for creator feeds.

Change the scraper slug and input values in the code for each scraper. Keep the polling, result download, and file write logic unchanged.

Step 3. Read the response

A completed job returns an array of post records. This trimmed response shows the fields you should expect from a successful URL scrape.

[
  {
    "inputs": {
      "url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415",
      "country": "US"
    },
    "scrape_status": "success",
    "url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415",
    "post_id": "7582623625484848415",
    "description": "There truly was magic in the *eras*✨ I can’t wait for you guys to see the first two episodes of The End of an Era and relive The Eras Tour | The Final Show TOMORROW on @Disney+ starting 12am PT / 3am ET ",
    "create_time": "2025-12-11T15:32:37.000Z",
    "digg_count": 354100,
    "share_count": "10800",
    "collect_count": 16326,
    "comment_count": 13200,
    "play_count": 5800000,
    "video_duration": 30,
    "original_sound": "Taylor Swift: original sound",
    "profile_id": "6881290705605477381",
    "profile_username": "Taylor Swift",
    "profile_url": "https://www.tiktok.com/@taylorswift",
    "profile_avatar": "https://p19-common-sign.tiktokcdn-us.com/tos-useast8-avt-0068-tx2/701431b0215286dd09f201e25b483662~tplv-tiktokx-cropcenter:1080:1080.jpeg?dr=9640&refresh_token=384e9292&x-expires=1778932800&x-signature=YqegqQHrSe3uniCWtuZ7NNzMJvo%3D&t=4d5b0474&ps=13740610&shp=a5d48078&shcp=81f88b70&idc=useast8",
    "profile_biography": "This is pretty much just a cat account",
    "preview_image": "https://p19-common-sign.tiktokcdn-us.com/tos-useast8-p-0068-tx2/ocBEQCDmFAXUOy1DARiyACTEC4vIQVAfApubfE~tplv-tiktokx-origin.image?dr=9636&x-expires=1778932800&x-signature=Q5uUG4nQSchC5fVMqLQGwP5pDI4%3D&t=4d5b0474&ps=13740610&shp=81f88b70&shcp=43f4a2f9&idc=useast8",
    "post_type": "video",
    "offical_item": false,
    "secu_id": "MS4wLjABAAAAqB08cUbXaDWqbD6MCga2RbGTuhfO2EsHayBYx08NDrN7IE3jQuRDNNN6YwyfH6_6",
    "original_item": false,
    "shortcode": "7582623625484848415",
    "width": 720,
    "ratio": "720p",
    "video_url": "https://v16-webapp-prime.us.tiktok.com/video/tos/useast8/tos-useast8-pve-0068-tx2/oY5peL9IIEUIApdDc6egGEFAjyeVSGqaIQoCCc/?a=1988&bti=ODszNWYuMDE6&&bt=1305&ft=4KJMyMzm8Zmo0iZuFa4jVgQOQpWrKsd.&mime_type=video_mp4&rc=ODszNDc5PGQ3MzQ1aGU8ZEBpM3hucXQ5cjNoNzMzaTczNEBeLS01NC40NWMxYzBiLTY2YSMwMG1jMmRrNXFhLS1kMTJzcw%3D%3D&expire=1778933445&l=20260514121015CAA19F4496299602DF64&ply_type=2&policy=2&signature=b3c739fbb8fbf029868272bc23df770c&tk=tt_chain_token&btag=e00088000",
    "music": {
      "authorname": "Taylor Swift",
      "covermedium": "https://p16-common-sign.tiktokcdn-us.com/tos-useast8-avt-0068-tx2/701431b0215286dd09f201e25b483662~tplv-tiktokx-cropcenter:720:720.jpeg?dr=9640&refresh_token=ce2c100a&x-expires=1778932800&x-signature=rpt4Cp0zLMhn1W5s4XnWrrCP100%3D&t=4d5b0474&ps=13740610&shp=a5d48078&shcp=81f88b70&idc=useast8",
      "id": "7582623461881678622",
      "original": true,
      "playurl": "https://v19.tiktokcdn-us.com/ba8eab38e77ecf24c188b76311ec7c13/6a061025/video/tos/useast8/tos-useast8-v-27dcd7-tx2/oIbQIVwqXDCCyvAfAuFvEJpABUKAjB2OIBfJ1E/?a=1233&bti=ODszNWYuMDE6&&bt=125&ft=GSDrKInz7Thdz2pGXq8Zmo&mime_type=audio_mpeg&rc=... (truncated)"
    }
  }
]

Treat direct media URLs as time-sensitive. TikTok asset URLs often contain signatures and expiration parameters. Store them for short-term processing, then refresh the post if your downloader receives an expired asset response.
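
A rough way to act on this is to read the expiry from the URL before handing it to a downloader. The sketch below assumes the `expire` and `x-expires` query parameters seen in the sample response hold Unix timestamps in seconds, which the source does not confirm. The function names and the example URL are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: these query parameters carry Unix timestamps in seconds.
EXPIRY_PARAMS = ("x-expires", "expire")


def asset_expiry(url: str) -> Optional[datetime]:
    """Return the expiry time encoded in a signed asset URL, or None if absent."""
    query = parse_qs(urlparse(url).query)
    for name in EXPIRY_PARAMS:
        values = query.get(name)
        if values and values[0].isdigit():
            return datetime.fromtimestamp(int(values[0]), tz=timezone.utc)
    return None


def is_expired(url: str, now: Optional[datetime] = None) -> bool:
    """True when the URL carries an expiry timestamp that has already passed."""
    expires_at = asset_expiry(url)
    if expires_at is None:
        return False  # no readable expiry; let the downloader decide
    return (now or datetime.now(timezone.utc)) >= expires_at


# Hypothetical URL that reuses the expire value from the sample video_url.
url = "https://example.com/video.mp4?mime_type=video_mp4&expire=1778933445"
print(asset_expiry(url).isoformat())
# 2026-05-16T12:10:45+00:00
```

If `is_expired` returns True, re-scrape the post to get a fresh URL instead of retrying the download.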

What data you get back

[Figure: TikTok Posts Scraper output fields grouped by category.]

The scraper returns one record per input URL. If you submit 100 URLs, expect up to 100 result objects. Each object includes its original input and scrape status.
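
Because a job can return fewer rows than you submitted, it can help to reconcile results against inputs. This is a minimal sketch that matches on the `url` inside each row's `inputs` object, as shown in the sample response. The function name is hypothetical.

```python
def find_missing_inputs(inputs: list[dict], results: list[dict]) -> list[dict]:
    """Return the submitted inputs that have no matching result row."""
    returned_urls = {
        (row.get("inputs") or {}).get("url")
        for row in results
    }
    return [item for item in inputs if item["url"] not in returned_urls]


submitted = [
    {"url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415", "country": "US"},
    {"url": "https://www.tiktok.com/@coldplay/video/1", "country": "US"},  # illustrative URL
]
returned = [
    {"inputs": submitted[0], "scrape_status": "success", "post_id": "7582623625484848415"},
]

print(find_missing_inputs(submitted, returned))
# [{'url': 'https://www.tiktok.com/@coldplay/video/1', 'country': 'US'}]
```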

| Field | Type | Use it for |
| --- | --- | --- |
| scrape_status | string | Filter successful and failed rows |
| post_id | string | Primary key for deduplication |
| url | string | Canonical post URL |
| description | string | Caption text and hashtags |
| create_time | string | Post timestamp in ISO format |
| digg_count | integer | Likes |
| share_count | string | Shares |
| collect_count | integer | Saves or collections |
| comment_count | integer | Comment volume |
| play_count | integer | Views |
| video_duration | integer | Duration in seconds |
| original_sound | string | Display name for the audio |
| profile_id | string | Creator ID |
| profile_username | string | Creator display username |
| profile_url | string | Creator profile URL |
| preview_image | string | Thumbnail URL |
| video_url | string | Direct video asset URL when available |
| music | object | Nested audio metadata |


Use post_id as your stable post key. TikTok URLs vary by tracking parameters, copied share links, and browser state. The post ID stays fixed across those variants.

Use profile_id when joining post rows to creator rows from Get TikTok profile data. The profile URL works for human review. The ID works better for database joins.

Use scrape_status before loading rows into production tables. Send failed rows to a retry queue with the original inputs object intact. That object gives you the exact URL and country used on the failed attempt.

Keep description as raw text. Captions can include emojis, hashtags, mentions, line breaks, and campaign tags. Parse hashtags into a separate table if you query them often.
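
One possible way to pull hashtags and mentions out of a caption is shown below. The regular expressions are a simplification, not TikTok's own parsing rules, and the function name is hypothetical. The raw `description` stays untouched; the parsed values are returned separately.

```python
import re

# Simplified patterns: a tag is "#" plus word characters,
# a mention is "@" plus word characters or periods.
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@([\w.]+)")


def extract_tags(description: str) -> dict:
    """Return lowercased hashtags and mentions, de-duplicated in first-seen order."""
    text = description or ""
    hashtags = [tag.lower() for tag in HASHTAG_RE.findall(text)]
    # Strip a trailing period that ends a sentence rather than a handle.
    mentions = [name.rstrip(".") for name in MENTION_RE.findall(text)]
    return {
        "hashtags": list(dict.fromkeys(hashtags)),
        "mentions": list(dict.fromkeys(mentions)),
    }


caption = "Tour diary ✨ #ErasTour #erastour tonight on @Disney+"
print(extract_tags(caption))
# {'hashtags': ['erastour'], 'mentions': ['Disney']}
```

Lowercasing hashtags before storage lets `#ErasTour` and `#erastour` land on the same row in a hashtag table.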

Store music as JSON. Audio metadata changes less often than engagement counts, and its nested fields do not belong in the main post table.

Production tips for clean TikTok post data

[Figure: How the TikTok Posts Scraper normalizes inputs before deduplication.]

Validate URLs before sending jobs

Reject invalid URLs before you spend credits.

from urllib.parse import urlparse

def validate_tiktok_video_input(item: dict) -> dict:
    url = item.get("url", "")
    country = item.get("country", "US")

    parsed = urlparse(url)

    if parsed.scheme != "https":
        raise ValueError(f"Invalid scheme: {url}")

    if parsed.netloc != "www.tiktok.com":
        raise ValueError(f"Invalid host: {url}")

    if "/video/" not in parsed.path:
        raise ValueError(f"URL is not a TikTok video URL: {url}")

    # Require exactly two letters so values such as "12" are rejected.
    if not isinstance(country, str) or len(country) != 2 or not country.isalpha():
        raise ValueError(f"Country must be a 2-letter code: {country}")

    return {
        "url": url,
        "country": country.upper()
    }


inputs = [
    {"url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415", "country": "us"}
]

clean_inputs = [validate_tiktok_video_input(item) for item in inputs]
print(clean_inputs)

Expected output:

[
  {
    "url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415",
    "country": "US"
  }
]

Resolve TikTok short links before validation if your input source contains copied mobile share URLs.
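
A minimal sketch of the pre-validation step: flag links that need resolving and strip tracking parameters from the rest. The short-link hosts listed here are an assumption about what your input source contains, and both function names are hypothetical. Resolving a short link requires an HTTP request that follows the redirect, which this sketch leaves out.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: hosts used by copied mobile share links.
SHORT_LINK_HOSTS = {"vm.tiktok.com", "vt.tiktok.com"}


def is_short_link(url: str) -> bool:
    """True when the URL must be resolved before it can be validated."""
    return urlsplit(url).netloc.lower() in SHORT_LINK_HOSTS


def strip_tracking(url: str) -> str:
    """Drop the query string and fragment, keeping scheme, host, and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path.rstrip("/"), "", ""))


# The query parameters below are illustrative.
copied = "https://www.tiktok.com/@taylorswift/video/7582623625484848415?is_from_webapp=1&sender_device=pc"
print(strip_tracking(copied))
# https://www.tiktok.com/@taylorswift/video/7582623625484848415
```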

Deduplicate on post_id

Use post_id as the primary key. Extract the ID from /video/{id} in the input URL to dedupe before scraping.

import re

def extract_post_id(url: str) -> str | None:
    match = re.search(r"/video/(\d+)", url)
    return match.group(1) if match else None


def dedupe_inputs(inputs: list[dict]) -> list[dict]:
    seen = set()
    deduped = []

    for item in inputs:
        post_id = extract_post_id(item["url"])
        key = post_id or item["url"]

        if key in seen:
            continue

        seen.add(key)
        deduped.append(item)

    return deduped


raw_inputs = [
    {"url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415", "country": "US"},
    {"url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415", "country": "US"}
]

print(dedupe_inputs(raw_inputs))

Expected output:

[
  {
    "url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415",
    "country": "US"
  }
]

If you dedupe across countries, include country in your key when regional access matters.
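
As a sketch of that variation, the dedupe key can become a `(post_id, country)` pair when the same post is scraped from more than one region. The function names and the `by_country` flag are hypothetical.

```python
import re


def dedupe_key(item: dict, by_country: bool = False) -> tuple:
    """Build a key from the post ID, plus the country when by_country is set."""
    match = re.search(r"/video/(\d+)", item["url"])
    post_key = match.group(1) if match else item["url"]
    country = (item.get("country") or "").upper() if by_country else ""
    return (post_key, country)


def dedupe_by_key(inputs: list[dict], by_country: bool = False) -> list[dict]:
    """Keep the first input seen for each key."""
    seen = set()
    kept = []
    for item in inputs:
        key = dedupe_key(item, by_country)
        if key not in seen:
            seen.add(key)
            kept.append(item)
    return kept


regional_inputs = [
    {"url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415", "country": "US"},
    {"url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415", "country": "GB"},
]

print(len(dedupe_by_key(regional_inputs)))                   # 1
print(len(dedupe_by_key(regional_inputs, by_country=True)))  # 2
```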

Keep a typed schema

Counts can arrive as integers or strings. Normalize before loading into your warehouse.

COUNT_FIELDS = [
    "digg_count",
    "share_count",
    "collect_count",
    "comment_count",
    "play_count"
]

def to_int(value):
    if value is None or value == "":
        return None
    return int(value)


def normalize_post(row: dict) -> dict:
    normalized = dict(row)

    for field in COUNT_FIELDS:
        normalized[field] = to_int(row.get(field))

    normalized["post_id"] = str(row["post_id"])
    normalized["profile_id"] = str(row.get("profile_id", ""))
    normalized["video_duration"] = to_int(row.get("video_duration"))

    return normalized

Store the full response JSON in a raw_payload column and map the fields you query often into typed columns.

CREATE TABLE tiktok_posts (
  post_id TEXT PRIMARY KEY,
  url TEXT NOT NULL,
  description TEXT,
  create_time TIMESTAMP,
  digg_count INTEGER,
  share_count INTEGER,
  collect_count INTEGER,
  comment_count INTEGER,
  play_count INTEGER,
  video_duration INTEGER,
  original_sound TEXT,
  profile_id TEXT,
  profile_username TEXT,
  profile_url TEXT,
  preview_image TEXT,
  video_url TEXT,
  scrape_status TEXT NOT NULL,
  raw_payload JSONB,
  scraped_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
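
To illustrate the mapping, the sketch below turns one result object into a dict keyed by the column names of the table above. It assumes count fields were already normalized, parses `create_time` into a timezone-aware datetime, and serializes the whole record for `raw_payload`. The function names are hypothetical, and the database driver call is left out.

```python
import json
from datetime import datetime
from typing import Optional

# Columns copied straight from the result object.
TYPED_COLUMNS = [
    "post_id", "url", "description",
    "digg_count", "share_count", "collect_count", "comment_count", "play_count",
    "video_duration", "original_sound",
    "profile_id", "profile_username", "profile_url",
    "preview_image", "video_url", "scrape_status",
]


def parse_create_time(value: Optional[str]) -> Optional[datetime]:
    """Parse a timestamp such as 2025-12-11T15:32:37.000Z into an aware datetime."""
    if not value:
        return None
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def to_db_row(post: dict) -> dict:
    """Map one result object to column values, keeping the full record as JSON."""
    row = {column: post.get(column) for column in TYPED_COLUMNS}
    row["create_time"] = parse_create_time(post.get("create_time"))
    row["raw_payload"] = json.dumps(post, ensure_ascii=False)
    return row


sample_post = {
    "post_id": "7582623625484848415",
    "url": "https://www.tiktok.com/@taylorswift/video/7582623625484848415",
    "create_time": "2025-12-11T15:32:37.000Z",
    "scrape_status": "success",
    "music": {"id": "7582623461881678622", "original": True},
}
row = to_db_row(sample_post)
print(row["create_time"].isoformat())
# 2025-12-11T15:32:37+00:00
```

Fields missing from the response come through as `None`, which maps to NULL in the nullable columns.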

Write engagement snapshots to a separate table so overwriting the main row does not lose trend data.

CREATE TABLE tiktok_post_metrics (
  post_id TEXT NOT NULL,
  digg_count INTEGER,
  share_count INTEGER,
  collect_count INTEGER,
  comment_count INTEGER,
  play_count INTEGER,
  scraped_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (post_id, scraped_at)
);
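
With snapshots stored per scrape, growth between two scrapes is a subtraction per metric. The following is a minimal sketch with a hypothetical function name; the `later` values are made up for illustration.

```python
METRIC_FIELDS = ("digg_count", "share_count", "collect_count", "comment_count", "play_count")


def metric_deltas(previous: dict, current: dict) -> dict:
    """Return current minus previous for each metric present in both snapshots."""
    if previous["post_id"] != current["post_id"]:
        raise ValueError("snapshots belong to different posts")
    deltas = {}
    for field in METRIC_FIELDS:
        before, after = previous.get(field), current.get(field)
        if before is None or after is None:
            continue  # skip metrics missing from either snapshot
        deltas[field] = int(after) - int(before)
    return deltas


earlier = {"post_id": "7582623625484848415", "play_count": 5800000, "digg_count": 354100}
later = {"post_id": "7582623625484848415", "play_count": 5950000, "digg_count": 361000}
print(metric_deltas(earlier, later))
# {'digg_count': 6900, 'play_count': 150000}
```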

Separate failed rows from empty fields

Use scrape_status and post_id together. A row with missing video_url can still be usable. A row without post_id needs review or retry.

def split_success_and_failures(results: list[dict]) -> tuple[list[dict], list[dict]]:
    successes = []
    failures = []

    for row in results:
        if row.get("scrape_status") == "success" and row.get("post_id"):
            successes.append(row)
        else:
            failures.append({
                "inputs": row.get("inputs"),
                "scrape_status": row.get("scrape_status"),
                "url": row.get("url")
            })

    return successes, failures

Cap retries at 3 attempts per URL. Track retry counts in your queue or job table, and keep scrape results immutable.
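
One way to apply the cap is to keep attempt counts outside the result rows and decide from those counts which inputs go back into the queue. This is a sketch under that assumption; the function name and the in-memory `Counter` stand in for whatever queue or job table you use.

```python
from collections import Counter

MAX_ATTEMPTS = 3


def plan_retries(failures: list[dict], attempts: Counter) -> tuple[list[dict], list[str]]:
    """Split failed rows into inputs to retry and URLs that reached the cap.

    `attempts` counts scrape attempts already made per URL,
    including the attempt that just failed.
    """
    retry, abandoned = [], []
    for row in failures:
        original = row.get("inputs") or {}
        url = original.get("url")
        if url is None:
            continue  # nothing to resubmit without the original input
        if attempts[url] >= MAX_ATTEMPTS:
            abandoned.append(url)
        else:
            retry.append(dict(original))  # copy so the result row stays untouched
    return retry, abandoned


failed_rows = [
    {"inputs": {"url": "https://www.tiktok.com/@a/video/1", "country": "US"}, "scrape_status": "failed"},
    {"inputs": {"url": "https://www.tiktok.com/@b/video/2", "country": "US"}, "scrape_status": "failed"},
]
attempt_counts = Counter({
    "https://www.tiktok.com/@a/video/1": 1,
    "https://www.tiktok.com/@b/video/2": 3,
})

to_retry, gave_up = plan_retries(failed_rows, attempt_counts)
print(to_retry)  # [{'url': 'https://www.tiktok.com/@a/video/1', 'country': 'US'}]
print(gave_up)   # ['https://www.tiktok.com/@b/video/2']
```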

Refresh metrics on a schedule

Scrape new campaign posts more often during the first day, then reduce frequency once growth slows. After repeated failures, mark the post as unavailable and stop scheduling it.
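
A schedule like this can be sketched as a function of post age and failure count. The intervals and the failure threshold below are hypothetical placeholders, and post age is used as a simple proxy for "growth slows"; tune both against your own data.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def refresh_interval(created_at: datetime, now: datetime) -> timedelta:
    """Pick a scrape interval from the post's age (placeholder thresholds)."""
    age = now - created_at
    if age < timedelta(days=1):
        return timedelta(hours=1)
    if age < timedelta(days=7):
        return timedelta(hours=6)
    return timedelta(days=1)


def next_refresh(
    created_at: datetime,
    last_scraped_at: datetime,
    consecutive_failures: int,
    max_failures: int = 3,
) -> Optional[datetime]:
    """Return the next scrape time, or None once the post counts as unavailable."""
    if consecutive_failures >= max_failures:
        return None
    return last_scraped_at + refresh_interval(created_at, last_scraped_at)


created = datetime(2026, 5, 1, 12, 0, tzinfo=timezone.utc)
last_scrape = created + timedelta(hours=2)
print(next_refresh(created, last_scrape, consecutive_failures=0))
# 2026-05-01 15:00:00+00:00
```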

Treat media URLs as temporary

The video_url, preview_image, and profile_avatar fields, plus playurl and covermedium inside music, contain signed CDN URLs. Download the asset soon after the scrape if you need durable media storage.

Pick the right TikTok post scraper

Use the URL scraper when you already know the posts. Use the search scrapers when discovery is part of the job.

| Scraper | Input | Best fit |
| --- | --- | --- |
| Extract TikTok post data | TikTok video URL | Hydrating known post URLs with metrics and metadata |
| Search TikTok posts by keyword | Keyword, max posts, country | Finding posts that match a search term |
| Pull posts from a TikTok profile | Profile URL | Collecting posts from one creator |
| Extract TikTok comments | TikTok post URL | Pulling comment rows after you identify posts |

For campaign monitoring, start with profile or keyword search. Then feed the returned post URLs into the URL scraper for refreshes. That pattern separates discovery from metric updates.
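
The hand-off from discovery to refresh could look like the sketch below. It assumes the search scrapers return rows with the same `url` and `scrape_status` fields as the URL scraper, which this article does not show. The function name is hypothetical.

```python
def to_url_inputs(discovered: list[dict], country: str = "US") -> list[dict]:
    """Turn discovery rows into URL scraper inputs, skipping failures and repeats."""
    seen = set()
    inputs = []
    for row in discovered:
        url = row.get("url")
        if row.get("scrape_status") != "success" or not url or url in seen:
            continue
        seen.add(url)
        inputs.append({"url": url, "country": country})
    return inputs


# Illustrative discovery rows.
discovered_rows = [
    {"scrape_status": "success", "url": "https://www.tiktok.com/@coldplay/video/1"},
    {"scrape_status": "success", "url": "https://www.tiktok.com/@coldplay/video/1"},
    {"scrape_status": "failed", "url": None},
]
print(to_url_inputs(discovered_rows))
# [{'url': 'https://www.tiktok.com/@coldplay/video/1', 'country': 'US'}]
```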

For known post lists, skip search. Send the URLs directly and store the result by post_id.

If you need comments after collecting posts, the TikTok comments scraper covers the comment-level response and API pattern. For the full list of TikTok and other platform scrapers, browse all 86+ scrapers.

Pricing

ScrapeNow charges per returned row. One row costs one credit, starting at $0.04 per credit for small runs and dropping with volume. No monthly contracts, no proxy fees, no charges for failed rows. See the pricing page for current rates.
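
As a worked example, the upper bound for a run is the number of returned rows multiplied by the per-credit price. This sketch uses the $0.04 starting rate quoted above and does not model volume discounts, so treat the result as a ceiling. The function name is hypothetical.

```python
from decimal import Decimal

STARTING_PRICE_PER_CREDIT = Decimal("0.04")  # starting rate from the paragraph above


def estimate_cost(returned_rows: int, price_per_credit: Decimal = STARTING_PRICE_PER_CREDIT) -> Decimal:
    """One returned row costs one credit; failed rows are not counted."""
    if returned_rows < 0:
        raise ValueError("returned_rows must not be negative")
    return returned_rows * price_per_credit


print(estimate_cost(2500))
# 100.00
```

`Decimal` avoids the rounding drift that floats can introduce when you add up many small charges.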

Start with Extract TikTok post data if you have video URLs ready. Copy the Python script above, replace YOUR_API_KEY, and run one URL.

After the first job completes, check scrape_status, post_id, and the count fields before loading more URLs. Then switch the slug to the TikTok Posts Search by Keyword scraper or the TikTok Posts Search by Profile URL scraper when you need discovery.
