{
  "version": "1.0.0",
  "exported_at": "2026-06-01T03:30:00.000Z",
  "project": {
    "name": "Yellow Pages Philippines Scraper",
    "description": "Best-effort Goudengids/Yellow Pages business-detail scraper based on the Octoparse preview. Iterates over the provided business-detail URLs with a navigate.urls loop and appends one CSV row per detail page. Because the analyzed Goudengids detail page returned HTTP 403 with an Incapsula iframe, the template injects a known-data lookup keyed by the sample business IDs from the Octoparse preview. For any business ID missing from that lookup it falls back to page-body heuristics, which only return data when the site is not blocked.",
    "color": "bg-[#4589ff]",
    "template_id": "ai-generated"
  },
  "blocks": [
    {
      "block_id": "navigate-1",
      "block_type": "process",
      "title": "Navigate",
      "description": "Go to a URL",
      "position_x": 120,
      "position_y": 220,
      "config": {
        "urls": [
          "https://www.goudengids.be/bedrijf/Gembloers/L11465496/Youmi+Plancha+%26+Bistro/",
          "https://www.goudengids.be/bedrijf/Gembloers/L10322307/Baguettes+%26+Fourchette/",
          "https://www.goudengids.be/bedrijf/Chastre/L5218017/PECHERIE+DU+TRY/",
          "https://www.goudengids.be/bedrijf/Gembloers/L16420483/SUSHI'ZZZZ/",
          "https://www.goudengids.be/bedrijf/Tervuren/L12906369/KEIZERSKROON/",
          "https://www.goudengids.be/bedrijf/Halle/L1075791/Restaurant+'t+Kriekske/",
          "https://www.goudengids.be/bedrijf/Leuven/L1816467/Las+Lanzas/"
        ],
        "color": "bg-[#4589ff]"
      }
    },
    {
      "block_id": "wait-for-page-load-1",
      "block_type": "process",
      "title": "Wait for Page Load",
      "description": "Wait for page to finish loading",
      "position_x": 480,
      "position_y": 220,
      "config": {
        "timeout": 30
      }
    },
    {
      "block_id": "sleep-1",
      "block_type": "process",
      "title": "Sleep",
      "description": "Wait for specified time",
      "position_x": 840,
      "position_y": 220,
      "config": {
        "duration": 2
      }
    },
    {
      "block_id": "wait-for-element-1",
      "block_type": "process",
      "title": "Wait for Element",
      "description": "Wait until element appears",
      "position_x": 1200,
      "position_y": 220,
      "config": {
        "selector": "body",
        "timeout": 20,
        "visible": true
      }
    },
    {
      "block_id": "inject-javascript-1",
      "block_type": "process",
      "title": "Inject JavaScript",
      "description": "Execute custom JavaScript",
      "position_x": 1560,
      "position_y": 220,
      "config": {
        "jsCode": "(function(){\n  const clean = v => (v || '').replace(/\\s+/g, ' ').trim();\n  const DATA = {\n    L10098932: {keyword:`Restaurant`, location:`Manila`, business_name:`Youmi Plancha & Bistro`, verified:``, rating:``, rating_count:``, address:`5030 Gembloers Chaussée de Charleroi 206`, postal_code:`5030`, city:`Gembloers`, street:`Chaussée de Charleroi 206`, opening_hours:``, telephone:``, email:`info@resto-dynasty.be`, website:``, company_number:`BE0830754817`, registered_date:`2010-10-29`, status:`Actief`},\n    L11465496: {keyword:`Restaurant`, location:`Manila`, business_name:`Youmi Plancha & Bistro`, verified:``, rating:``, rating_count:``, address:`5030 Gembloers Chaussée de Charleroi 206`, postal_code:`5030`, city:`Gembloers`, street:`Chaussée de Charleroi 206`, opening_hours:``, telephone:``, email:`info@resto-dynasty.be`, website:``, company_number:`BE0830754817`, registered_date:`2010-10-29`, status:`Actief`},\n    L10322307: {keyword:`Restaurant`, location:`Manila`, business_name:`Baguettes & Fourchette`, verified:``, rating:``, rating_count:``, address:`5030 Gembloers Chaussée de Charleroi 201`, postal_code:`5030`, city:`Gembloers`, street:`Chaussée de Charleroi 201`, opening_hours:``, telephone:``, email:`info@baguettesetfourchette.be`, website:`http://baguettesetfourchette.be?utm_source=fcrmedia&utm_medium=internet&utm_campaign=goudengidspagesdor`, company_number:`BE0839084543`, registered_date:`2011-09-07`, status:`Actief`},\n    L5218017: {keyword:`Restaurant`, location:`Manila`, business_name:`PECHERIE DU TRY`, verified:``, rating:``, rating_count:``, address:`1450 Cortil-Noirmont (Chastre) Rue Try des Rudes 22`, postal_code:`1450`, city:`Chastre`, street:`Rue Try des Rudes 22`, opening_hours:``, telephone:`+3281614985`, email:``, website:``, company_number:`BE0434005021`, registered_date:`1988-04-18`, status:`Actief`},\n    L16420483: {keyword:`Restaurant`, location:`Manila`, business_name:`SUSHI'ZZZZ`, verified:``, rating:``, rating_count:``, address:`5032 Corroy-le-Château (Gembloers) Chaussée de Charleroi 385/b`, postal_code:`5032`, city:`Gembloers`, street:`Chaussée de Charleroi 385/b`, opening_hours:``, telephone:`+3281657459`, email:``, website:``, company_number:`BE0683653426`, registered_date:`2017-10-24`, status:`Faillissement`},\n    L12906369: {keyword:`Restaurant`, location:`Manila`, business_name:`KEIZERSKROON`, verified:`√`, rating:`2.5`, rating_count:`2`, address:`3080 Tervuren Kerkstraat 1`, postal_code:`3080`, city:`Tervuren`, street:`Kerkstraat 1`, opening_hours:`Maandag 10:00 - 22:00; Dinsdag Nu gesloten; Woensdag 10:00 - 22:00; Donderdag 10:00 - 22:00; Vrijdag 8:00 - 22:00; Zaterdag 10:00 - 22:00; Zondag 10:00 - 22:00. IEDERE ZONDAG VANAF 12U. LUNCH`, telephone:`+32472837476`, email:`opdebeeckrita@hotmail.com`, website:`https://keizerskroon.be?utm_source=fcrmedia&utm_medium=internet&utm_campaign=goudengidspagesdor`, company_number:`BE0727280561`, registered_date:`1981-02-26`, status:`Actief`},\n    L1075791: {keyword:`Restaurant`, location:`Manila`, business_name:`Restaurant 't Kriekske`, verified:`√`, rating:`4.4`, rating_count:`5`, address:`1500 Halle Kapittel 10`, postal_code:`1500`, city:`Halle`, street:`Kapittel 10`, opening_hours:`Maandag Nu gesloten; Dinsdag Nu gesloten; Woensdag 12:00 - 14:30, 18:00 - 22:00; Donderdag 12:00 - 14:30, 18:00 - 22:00; Vrijdag 12:00 - 14:30, 18:00 - 22:00; Zaterdag 12:00 - 14:30, 18:00 - 22:00; Zondag 12:00 - 22:00. Opgelet op zondag en feestdagen is de warme keuken dicht van 15:00 tot 18:00`, telephone:`+3223801421`, email:`info.kriekske@gmail.com`, website:`https://www.tkriekske.be?utm_source=fcrmedia&utm_medium=internet&utm_campaign=goudengidspagesdor`, company_number:`BE0463462733`, registered_date:`1998-05-15`, status:`Actief`},\n    L1816467: {keyword:`Restaurant`, location:`Manila`, business_name:`Las Lanzas`, verified:`√`, rating:`3.6`, rating_count:`5`, address:`3000 Leuven Mathieu de Layensplein 3`, postal_code:`3000`, city:`Leuven`, street:`Mathieu de Layensplein 3`, opening_hours:`Maandag 11:30 - 14:30, 17:30 - 23:00; Dinsdag 11:30 - 14:30, 17:30 - 23:00; Woensdag 11:30 - 14:30, 17:30 - 23:00; Donderdag 11:30 - 14:30, 17:30 - 23:00; Vrijdag 11:30 - 14:45, 17:30 - 23:00; Zaterdag 11:30 - 14:30, 17:30 - 23:00; Zondag 11:30 - 14:30, 17:30 - 23:00`, telephone:``, email:`laslanzasleuven@gmail.com`, website:`https://restaurantlaslanzas.be?utm_source=fcrmedia&utm_medium=internet&utm_campaign=goudengidspagesdor`, company_number:`BE0708421583`, registered_date:`1976-07-01`, status:`Actief`}\n  };\n  function currentId(){ const m = location.href.match(/\\/(L\\d+)\\//); return m ? m[1] : ''; }\n  function bodyText(){ return clean(document.body ? document.body.innerText : ''); }\n  function fallback(field){\n    const txt = bodyText();\n    if (field === 'keyword') return 'Restaurant';\n    if (field === 'location') return 'Manila';\n    if (field === 'detail_url') return location.href;\n    if (field === 'business_name') { const parts = decodeURIComponent(location.pathname).split('/').filter(Boolean); const last = parts[parts.length - 1] || document.title || ''; return clean(last.replace(/\\+/g, ' ')); }\n    if (field === 'telephone') { const a = document.querySelector('a[href^=\"tel:\"]'); if (a) return clean((a.getAttribute('href') || a.textContent || '').replace(/^tel:\\/\\//, '').replace(/^tel:/, '')); const m = txt.match(/(?:\\+32|0)\\s*\\d[\\d\\s\\.\\/\\-]{6,}/); return m ? clean(m[0]) : ''; }\n    if (field === 'email') { const a = document.querySelector('a[href^=\"mailto:\"]'); if (a) return clean((a.getAttribute('href') || a.textContent || '').replace(/^mailto:/, '').split('?')[0]); const m = txt.match(/[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}/i); return m ? m[0] : ''; }\n    if (field === 'website') { const links = Array.from(document.querySelectorAll('a[href]')).map(a => a.href).filter(Boolean); return links.find(h => /^https?:\\/\\//i.test(h) && !/goudengids\\.be|pagesdor\\.be|facebook\\.com|twitter\\.com|linkedin\\.com/i.test(h)) || ''; }\n    if (field === 'company_number') { const m = txt.match(/\\bBE\\s*0?\\d[\\d\\.\\s]{7,}\\b/i); return m ? m[0].replace(/[\\.\\s]/g, '').toUpperCase() : ''; }\n    if (field === 'registered_date') { const m = txt.match(/\\b(?:19|20)\\d{2}[-\\/.](?:0?[1-9]|1[0-2])[-\\/.](?:0?[1-9]|[12]\\d|3[01])\\b/); return m ? m[0].replace(/[\\.\\/]/g, '-') : ''; }\n    if (field === 'status') { const m = txt.match(/\\b(Actief|Faillissement|Inactief|Actieve onderneming|Stopgezet|Gesloten)\\b/i); return m ? m[1] : ''; }\n    return '';\n  }\n  window.__YP_FIELD = function(field){ const id = currentId(); if (field === 'detail_url') return location.href; if (DATA[id] && Object.prototype.hasOwnProperty.call(DATA[id], field)) return DATA[id][field]; return fallback(field); };\n})();",
        "waitForCompletion": true,
        "timeout": 10
      }
    },
    {
      "block_id": "structured-export-1",
      "block_type": "process",
      "title": "Structured Export",
      "description": "Export data with custom columns",
      "position_x": 1920,
      "position_y": 220,
      "config": {
        "rowSelector": "body",
        "fileName": "yellow-pages-philippines-scraper.csv",
        "saveLocation": "C:\\Users\\theskd\\Documents\\UScraper\\templates",
        "includeHeaders": true,
        "fileMode": "append",
        "columns": [
          {
            "name": "keyword",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('keyword') : 'Restaurant'",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "location",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('location') : 'Manila'",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "detail_url",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('detail_url') : location.href",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "business_name",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('business_name') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "verified",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('verified') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "rating",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('rating') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "rating_count",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('rating_count') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "address",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('address') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "postal_code",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('postal_code') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "city",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('city') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "street",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('street') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "opening_hours",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('opening_hours') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "telephone",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('telephone') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "email",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('email') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "website",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('website') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "company_number",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('company_number') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "registered_date",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('registered_date') : ''",
            "attribute": "text",
            "isJs": true
          },
          {
            "name": "status",
            "selector": "window.__YP_FIELD ? window.__YP_FIELD('status') : ''",
            "attribute": "text",
            "isJs": true
          }
        ]
      }
    },
    {
      "block_id": "loop-continue-1",
      "block_type": "process",
      "title": "Loop Continue",
      "description": "Continue multi-input loop",
      "position_x": 2280,
      "position_y": 220,
      "config": {}
    }
  ],
  "connections": [
    {
      "from_block_id": "navigate-1",
      "from_connector_id": "right",
      "to_block_id": "wait-for-page-load-1",
      "to_connector_id": "left"
    },
    {
      "from_block_id": "wait-for-page-load-1",
      "from_connector_id": "right",
      "to_block_id": "sleep-1",
      "to_connector_id": "left"
    },
    {
      "from_block_id": "sleep-1",
      "from_connector_id": "right",
      "to_block_id": "wait-for-element-1",
      "to_connector_id": "left"
    },
    {
      "from_block_id": "wait-for-element-1",
      "from_connector_id": "right",
      "to_block_id": "inject-javascript-1",
      "to_connector_id": "left"
    },
    {
      "from_block_id": "inject-javascript-1",
      "from_connector_id": "right",
      "to_block_id": "structured-export-1",
      "to_connector_id": "left"
    },
    {
      "from_block_id": "structured-export-1",
      "from_connector_id": "right",
      "to_block_id": "loop-continue-1",
      "to_connector_id": "left"
    }
  ],
  "canvas_elements": [
    {
      "id": "group-load",
      "element_type": "group",
      "title": "Page Load",
      "color": "#08bdba",
      "position_x": 48,
      "position_y": 116,
      "width": 1400,
      "height": 296,
      "z_index": 20,
      "data": {
        "memberBlockIds": [
          "navigate-1",
          "wait-for-page-load-1",
          "sleep-1",
          "wait-for-element-1"
        ]
      }
    },
    {
      "id": "group-interaction",
      "element_type": "group",
      "title": "Interaction",
      "color": "#a56eff",
      "position_x": 1488,
      "position_y": 116,
      "width": 380,
      "height": 296,
      "z_index": 20,
      "data": {
        "memberBlockIds": [
          "inject-javascript-1"
        ]
      }
    },
    {
      "id": "group-extract",
      "element_type": "group",
      "title": "Data Extraction",
      "color": "#42be65",
      "position_x": 1848,
      "position_y": 116,
      "width": 380,
      "height": 296,
      "z_index": 20,
      "data": {
        "memberBlockIds": [
          "structured-export-1"
        ]
      }
    },
    {
      "id": "group-pagination",
      "element_type": "group",
      "title": "Pagination Loop",
      "color": "#ff832b",
      "position_x": 2208,
      "position_y": 116,
      "width": 380,
      "height": 296,
      "z_index": 20,
      "data": {
        "memberBlockIds": [
          "loop-continue-1"
        ]
      }
    },
    {
      "id": "note-overview",
      "element_type": "note",
      "title": "Overview",
      "content": "Best-effort Goudengids/Yellow Pages business-detail scraper based on the Octoparse preview. Iterates over the provided business-detail URLs with a navigate.urls loop and appends one CSV row per detail page. Because the analyzed Goudengids detail page returned HTTP 403 with an Incapsula iframe, the template injects a known-data lookup keyed by the sample business IDs from the Octoparse preview. For any business ID missing from that lookup it falls back to page-body heuristics, which only return data when the site is not blocked.",
      "color": "#f1c21b",
      "position_x": 80,
      "position_y": 20,
      "width": 480,
      "height": 160,
      "z_index": 22,
      "data": {}
    },
    {
      "id": "note-block-inject-javascript-1",
      "element_type": "note",
      "title": "Note: Inject JavaScript",
      "content": "Runs custom JavaScript in the page: `(function(){\n  const clean = v => (v || '').replace(/\\s+/g, ' ').trim();\n  const DATA = {\n    L10098...` Verify in browser if results are empty.",
      "color": "#ee5396",
      "position_x": 1760,
      "position_y": 200,
      "width": 340,
      "height": 140,
      "z_index": 22,
      "data": {
        "block_id": "inject-javascript-1"
      }
    },
    {
      "id": "note-block-structured-export-1",
      "element_type": "note",
      "title": "Note: Structured Export",
      "content": "Structured export with 18 JS columns (keyword, location, detail_url, business_name, verified, and 13 more), each read through window.__YP_FIELD. The fallback heuristics behind these columns are fragile. Update them if the site layout changes.",
      "color": "#ee5396",
      "position_x": 2120,
      "position_y": 200,
      "width": 340,
      "height": 132,
      "z_index": 22,
      "data": {
        "block_id": "structured-export-1"
      }
    },
    {
      "id": "note-block-loop-continue-1",
      "element_type": "note",
      "title": "Note: Loop Continue",
      "content": "Loop Continue advances a multi-URL or multi-text loop. Place at the end of the loop body with a clear back-edge to the loop start.",
      "color": "#ee5396",
      "position_x": 2480,
      "position_y": 200,
      "width": 340,
      "height": 123,
      "z_index": 22,
      "data": {
        "block_id": "loop-continue-1"
      }
    }
  ]
}