What is Custom Extraction?

You are asking about Custom Extraction, so I will explain it in the SEO / web scraping context. If you meant a different context, let me know.

What is Custom Extraction?

Custom Extraction means pulling specific data out of a website or source by manually defining what to extract, as you can do in the Screaming Frog SEO tool. It is more targeted and specific than generic automatic scraping.

Use cases / Importance:

  1. You can extract specific elements from any website (see the sketch after this list), such as:

    • Product prices

    • Meta descriptions

    • H1, H2 tags

    • Image URLs

    • Structured data

  2. It is used in SEO audits when the default crawl data is not enough.

  3. It is useful for large websites where you only need particular data.
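For illustration, here is a minimal Python sketch (not Screaming Frog itself) that pulls a few of these elements with XPath. The URL is a hypothetical placeholder, and the requests and lxml libraries are assumptions for the example:

```python
import requests
from lxml import html

url = "https://example.com/sample-page"  # hypothetical page
tree = html.fromstring(requests.get(url, timeout=10).text)

meta_description = tree.xpath("//meta[@name='description']/@content")
h1_tags = tree.xpath("//h1/text()")
image_urls = tree.xpath("//img/@src")

print(meta_description, h1_tags, image_urls)
```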

How it works (Screaming Frog example):

  1. Crawl the website: Crawl the site in Screaming Frog.

  2. Custom Extraction tab: Define an XPath or regex here so that the specific data gets picked up.

  3. Extracted data: After the crawl, the tool shows that data in the report, and you can also export it to CSV (a minimal Python analogue of this workflow follows below).
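To make the crawl → extract → export workflow concrete outside the tool, here is a rough Python sketch of the same idea: fetch a few pages, apply one extraction rule, and write the results to a CSV. The URLs, the XPath, and the libraries used (requests, lxml) are illustrative assumptions, not part of Screaming Frog:

```python
import csv
import requests
from lxml import html

urls = [
    "https://example.com/page-1",   # hypothetical URLs
    "https://example.com/page-2",
]
xpath_rule = "//span[@class='price']/text()"  # the extraction rule

with open("extraction.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["url", "extracted"])
    for url in urls:
        tree = html.fromstring(requests.get(url, timeout=10).text)
        matches = tree.xpath(xpath_rule)
        writer.writerow([url, "; ".join(m.strip() for m in matches)])
```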

Example:

If you want to extract a product price:

  • XPath: //span[@class='price']

  • Or Regex: ₹(\d+,?\d*)

This will extract only the matching price and ignore the rest of the content.
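As a rough sketch of what these two rules do, the snippet below applies the same XPath and regex to a small inline HTML fragment; the fragment and the lxml/re usage are assumptions for the example:

```python
import re
from lxml import html

fragment = "<div><span class='price'>₹1,299</span><span>In stock</span></div>"
tree = html.fromstring(fragment)

# XPath rule: only the <span class='price'> text is returned
print(tree.xpath("//span[@class='price']/text()"))   # ['₹1,299']

# Regex rule: only the number after the ₹ sign is captured
print(re.findall(r"₹(\d+,?\d*)", fragment))          # ['1,299']
```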

Below, the same topic is broken down in more detail, along with a step-by-step Custom Extraction setup in Screaming Frog.

1️⃣ Custom Extraction Meaning (Detail)

Custom Extraction means defining and extracting specific data that a normal crawl does not give you.

  • Normal crawl: Title, Meta, H1, H2, and URL are all collected automatically.

  • Custom Extraction: Used when you need special information (such as product price, SKU, ratings, reviews).


2️⃣ How it works (Technical)

You use XPath, CSS Selectors, or Regex to define exactly what should be extracted (see the sketch after this list).

  • XPath: For locating elements in XML/HTML.
    Example: //div[@class='product-price']/span → picks the product price.

  • CSS Selector: Selects a web element by class or id, the way stylesheets target it.
    Example: div.product-price > span

  • Regex (Regular Expression): Extracts text based on a pattern.
    Example: ₹(\d+,?\d*) → extracts the price digits
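A minimal sketch of the first two methods picking the same element, using this section's example selectors on a made-up fragment; it assumes the lxml and BeautifulSoup libraries are installed:

```python
from lxml import html
from bs4 import BeautifulSoup

fragment = "<div class='product-price'><span>₹2,499</span></div>"

# XPath walks the HTML structure explicitly
print(html.fromstring(fragment).xpath("//div[@class='product-price']/span/text()"))
# -> ['₹2,499']

# The CSS selector targets the same element the way a stylesheet would
soup = BeautifulSoup(fragment, "html.parser")
print([el.get_text() for el in soup.select("div.product-price > span")])
# -> ['₹2,499']
```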


3️⃣ Use Cases

  1. E-commerce websites: Extracting product name, price, SKU, ratings

  2. Blogs / News websites: Extracting author name, publish date, categories (see the sketch below)

  3. SEO Audits: Custom checks on H1, H2, and Meta tags

  4. Competitor Analysis: Scraping specific product info from competitors
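As an example of the second use case, here is a minimal sketch that pulls an author name and publish date out of a blog article; the HTML structure, class names, and library are assumptions made up for illustration:

```python
from lxml import html

article = """
<article>
  <h1>Sample Post</h1>
  <span class="author">Jane Doe</span>
  <time datetime="2024-05-01">1 May 2024</time>
</article>
"""
tree = html.fromstring(article)

author = tree.xpath("//span[@class='author']/text()")
publish_date = tree.xpath("//time/@datetime")
print(author, publish_date)   # ['Jane Doe'] ['2024-05-01']
```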


4️⃣ Steps in Screaming Frog

  1. Open Screaming Frog → crawl the website

  2. In the top menu, go to Configuration → Custom → Extraction

  3. Add a new extraction rule:

    • Type: XPath / Regex / CSS Selector

    • Name: The field name (e.g. Product Price)

    • Pattern: Define the XPath or Regex

  4. After the crawl completes, the extracted data appears in the Custom Extraction tab

  5. Export to CSV and analyse the data (a small sketch of such an analysis follows below)
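A quick way to sanity-check the export is to load the CSV and count empty extraction fields. This is only a sketch: it assumes pandas is available, and the file name and column names ("Product Price 1", "Address") are placeholders, since the real export columns depend on how the rule was named:

```python
import pandas as pd

df = pd.read_csv("custom_extraction_export.csv")   # hypothetical export file
column = "Product Price 1"                          # placeholder column name

missing = df[column].isna().sum()
print(f"{missing} of {len(df)} pages have no extracted value in '{column}'")
print(df[df[column].isna()]["Address"].head())      # URLs where the rule matched nothing
```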


5️⃣ Tips / Best Practices

  • The XPath or Regex must be accurate, otherwise data can be missed

  • Use filters on large websites, otherwise unnecessary data will come in as well

  • Use the Preview option in Screaming Frog so the extraction can be tested first



6️⃣ Advanced Custom Extraction Concepts

A. XPath vs CSS Selector vs Regex

  • XPath: based on the HTML structure, accurate. Example: //h1[@class='product-title']

  • CSS Selector: simple element selection by class/id. Example: div.product-title > h1

  • Regex: flexible text-pattern matching. Example: ₹(\d+,?\d+) → extracts the price numbers

Tip: If an element is loaded dynamically (JavaScript), XPath/CSS can fail on the raw HTML; in that case you need to enable the Render JavaScript option in Screaming Frog (a headless-browser sketch of the same idea follows below).
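Outside Screaming Frog, the equivalent idea is to render the page in a headless browser before extracting. A minimal sketch, assuming Playwright is installed and its browsers downloaded; the URL and price class are placeholders:

```python
from lxml import html
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com/js-rendered-product")  # hypothetical URL
    rendered = page.content()   # HTML after JavaScript has executed
    browser.close()

# The XPath now also sees elements that were injected by JavaScript
tree = html.fromstring(rendered)
print(tree.xpath("//span[@class='price']/text()"))
```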


B. Screaming Frog Custom Extraction Types

  1. XPath Extraction → for selecting an exact HTML element

  2. Regex Extraction → for matching data by text pattern

  3. Multiple Extraction Rules → for extracting more than one data field in a single crawl

  4. Data Preview → lets you test and verify rules live during the crawl


C. Example: E-commerce Product Crawl

Suppose you want to extract the following from an e-commerce site:

  • Product Name: //h1[@class='product-title']

  • Price: //span[@class='price'] or Regex ₹(\d+,?\d+)

  • SKU: //span[@id='sku']

  • Rating: //div[@class='rating']/@data-rating

  • Availability: //p[@class='availability']

This way you can extract all the important fields in a single crawl.
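The same multi-field idea, sketched in Python: one dictionary of field names to XPath rules, applied to each crawled page. The URL and class names are placeholders taken from the list above:

```python
import requests
from lxml import html

rules = {
    "Product Name": "//h1[@class='product-title']/text()",
    "Price": "//span[@class='price']/text()",
    "SKU": "//span[@id='sku']/text()",
    "Rating": "//div[@class='rating']/@data-rating",
    "Availability": "//p[@class='availability']/text()",
}

url = "https://example.com/product/123"   # hypothetical product page
tree = html.fromstring(requests.get(url, timeout=10).text)

record = {field: tree.xpath(xp) for field, xp in rules.items()}
print(record)   # one row of extracted fields for this page
```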


D. Tips for Large Scale Extraction

  1. Filter by URL pattern → crawl only the relevant pages

  2. Avoid overloading the server → adjust the crawl speed

  3. Test extraction rules → test on 5–10 pages first (see the sketch after this list)

  4. Export regularly → save the data to CSV/Excel
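A rough sketch of the first three tips outside the tool: keep only URLs that match a pattern, limit the test run to a handful of pages, and pause between requests so the server is not overloaded. The URLs, pattern, rule, and delay are all illustrative assumptions:

```python
import re
import time
import requests
from lxml import html

all_urls = [
    "https://example.com/product/1",
    "https://example.com/product/2",
    "https://example.com/blog/why-we-exist",
]

# 1) Filter by URL pattern: keep only product pages
product_urls = [u for u in all_urls if re.search(r"/product/\d+", u)]

# 3) Test the rule on a handful of pages first
for url in product_urls[:5]:
    tree = html.fromstring(requests.get(url, timeout=10).text)
    print(url, tree.xpath("//span[@class='price']/text()"))
    time.sleep(1)   # 2) throttle so the server is not overloaded
```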


E. Common Problems & Fixes

  • Problem: Empty extraction field
    Fix: The XPath or Regex is incorrect, or the element is loaded dynamically

  • Problem: Multiple matches for one field
    Fix: Use position() in the XPath or refine the Regex pattern (see the sketch after this list)

  • Problem: JavaScript-generated content missing
    Fix: Enable "Render JavaScript" in Screaming Frog
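For the multiple-matches case, restricting the XPath by position keeps only the occurrence you want. A minimal sketch with a made-up fragment containing two price spans:

```python
from lxml import html

fragment = """
<div>
  <span class='price'>₹999</span>
  <span class='price'>₹1,499</span>
</div>
"""
tree = html.fromstring(fragment)

# Unrestricted rule: both prices match
print(tree.xpath("//span[@class='price']/text()"))            # ['₹999', '₹1,499']

# Restricted with position(): only the first match is kept
print(tree.xpath("//span[@class='price'][position()=1]/text()"))  # ['₹999']
```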
