The diagram is drawn from the current config; an older run may have been produced with a different orchestrator. BPMN-like notation: green circle = start event, blue diamond with a plus sign = parallel fork/join, rounded rectangle = task (color = status), double red circle = end event. Hover over a task to see details (status, timestamps, error).
| Stage | Status | Started | Finished | Duration | Error |
|---|---|---|---|---|---|
| data_extraction | failed | — | 2026-03-29 10:53:57 | — | Async discovery: no BERT candidate URL produced valid content. Tried 0 URL(s). |
| discovery_fetch_validation | failed | 2026-03-29 10:50:13 | 2026-03-29 10:53:56 | 3 min 43 s | Async discovery: no BERT candidate URL produced valid content. Tried 0 URL(s). |
| metadata_alt | completed | 2026-03-29 10:50:13 | 2026-03-29 10:53:56 | 3 min 42 s | — |
| reviews | completed | 2026-03-29 10:50:54 | 2026-03-29 10:53:56 | 3 min 2 s | — |
| taxonomy_enrichment_alt | failed | — | 2026-03-29 10:53:57 | — | Async discovery: no BERT candidate URL produced valid content. Tried 0 URL(s). |
The run ended with an error; details are in the stage table above.
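The Duration column appears to be the difference between a stage's `started_at` and `completed_at`, truncated to whole seconds, with a dash shown when the stage never started. A minimal Python sketch under that assumption follows; the function name `format_duration` and the truncation rule are inferred from the displayed values, not taken from the pipeline code.

```python
from datetime import datetime
from typing import Optional


def format_duration(started_at: Optional[str], completed_at: Optional[str],
                    missing: str = "\u2014") -> str:
    """Render an elapsed time such as '3 min 43 s' from two ISO 8601 timestamps.

    Assumption: fractional seconds are truncated rather than rounded, which is
    consistent with the values shown in the stage table.
    """
    if not started_at or not completed_at:
        # A stage that never started has no meaningful duration.
        return missing
    delta = datetime.fromisoformat(completed_at) - datetime.fromisoformat(started_at)
    minutes, seconds = divmod(int(delta.total_seconds()), 60)
    if minutes:
        return f"{minutes} min {seconds} s"
    return f"{seconds} s"


print(format_duration("2026-03-29T10:50:13.782701", "2026-03-29T10:53:56.908254"))
```

With the `discovery_fetch_validation` timestamps this reproduces the `3 min 43 s` shown in the table; a stage with `started_at` set to null falls back to the placeholder.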
{
  "execution_id": "34b3d4b3-5f4a-43ff-b3a3-3ccbf42e41d0",
  "input_url": "https://apponyi.eu/",
  "state_filename": "20260329_105013_apponyi_eu.json",
  "created_at": "2026-03-29T10:50:13.121388",
  "updated_at": "2026-03-29T10:53:57.200536",
  "stages": {
    "metadata_alt": {
      "stage_name": "metadata_alt",
      "status": "completed",
      "started_at": "2026-03-29T10:50:13.638798",
      "completed_at": "2026-03-29T10:53:56.602216",
      "result": {
        "metadata": {
          "company_name": "Apponyi Magánklinika",
          "description": "Az Apponyi Magánklinika egy több szakterületet lefedő magánklinika Kaposváron. A magánrendelés 1992-ben indult a betegek kezdeményezésére, az épületet Lőrincz Ferenc Ybl-díjas építész tervei alapján 1998. május 1-én nyitották meg. A klinikán belül járóbeteg-szolgáltatások és diagnosztikai egységek működnek, többek között teljes körű ultrahang-diagnosztika és terheléses EKG rendszer. Számos szakterület elérhető (pl. belgyógyászat, kardiológia, ideggyógyászat, bőrgyógyászat, urológia, sebészet), valamint laborvizsgálatok és online előjegyzés. A honlap részletes tájékoztatást ad az orvosokról, rendelési időkről és a szolgáltatásokról; a klinika koordinátora Dr. Rucz Károly.",
          "arlista_url": "https://www.apponyi.eu/arak",
          "varos": "Kaposvár",
          "iranyitoszam": "7400",
          "utca": "Gróf Apponyi Albert u. 41.",
          "telefonszam": "82/311-519",
          "email": "klinika@somogy.hu",
          "website": "https://apponyi.eu/"
        },
        "llm_usage": {
          "prompt_tokens": 18235,
          "completion_tokens": 1742,
          "total_tokens": 19977,
          "cost": 0.00804275
        }
      },
      "error": null,
      "metadata": {}
    },
    "discovery_fetch_validation": {
      "stage_name": "discovery_fetch_validation",
      "status": "failed",
      "started_at": "2026-03-29T10:50:13.782701",
      "completed_at": "2026-03-29T10:53:56.908254",
      "result": null,
      "error": "Async discovery: no BERT candidate URL produced valid content. Tried 0 URL(s).",
      "metadata": {}
    },
    "reviews": {
      "stage_name": "reviews",
      "status": "completed",
      "started_at": "2026-03-29T10:50:54.084088",
      "completed_at": "2026-03-29T10:53:56.751852",
      "result": {
        "reviews": {
          "company_name": "Apponyi Magánklinika",
          "total_reviews": 3,
          "average_rating": 5,
          "reviews": [
            {
              "author": "Zsolt Szeness",
              "rating": 5,
              "text": "Dr. Apponyi Dániel ügyvéd úr, segítségével gördülékenyen és könnyen zárult a volt munkáltatómmal való, munkaügyi jogvita, peren kívüli megegyezéssel.",
              "date": null
            }
          ],
          "source": "google-maps-scraper",
          "postal_code": "",
          "city": "",
          "street": "",
          "phone": ""
        }
      },
      "error": null,
      "metadata": {}
    },
    "data_extraction": {
      "stage_name": "data_extraction",
      "status": "failed",
      "started_at": null,
      "completed_at": "2026-03-29T10:53:57.054608",
      "result": null,
      "error": "Async discovery: no BERT candidate URL produced valid content. Tried 0 URL(s).",
      "metadata": {}
    },
    "taxonomy_enrichment_alt": {
      "stage_name": "taxonomy_enrichment_alt",
      "status": "failed",
      "started_at": null,
      "completed_at": "2026-03-29T10:53:57.200521",
      "result": null,
      "error": "Async discovery: no BERT candidate URL produced valid content. Tried 0 URL(s).",
      "metadata": {}
    }
  },
  "overall_status": "failed",
  "current_stage": "taxonomy_enrichment_alt",
  "resume_from_stage": null,
  "llm_usage_summary": null
}
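In the state above, two of the failed stages have `started_at` set to null and repeat the error text of `discovery_fetch_validation`, which suggests they never ran and were marked failed because of the upstream failure. The sketch below separates the two groups. The helper `split_failures` is hypothetical, and reading a null `started_at` as "never started" is an assumption about how the orchestrator records skipped stages.

```python
from typing import Dict, List


def split_failures(state: dict) -> Dict[str, List[str]]:
    """Group failed stages into ones that ran and ones that never started."""
    groups: Dict[str, List[str]] = {"ran_and_failed": [], "never_started": []}
    for name, stage in state.get("stages", {}).items():
        if stage.get("status") != "failed":
            continue
        # Assumption: a null started_at means the stage was never executed.
        key = "ran_and_failed" if stage.get("started_at") else "never_started"
        groups[key].append(name)
    return groups


# Trimmed copy of the state shown above (only the fields the helper reads).
example_state = {
    "stages": {
        "metadata_alt": {"status": "completed", "started_at": "2026-03-29T10:50:13.638798"},
        "discovery_fetch_validation": {"status": "failed", "started_at": "2026-03-29T10:50:13.782701"},
        "reviews": {"status": "completed", "started_at": "2026-03-29T10:50:54.084088"},
        "data_extraction": {"status": "failed", "started_at": None},
        "taxonomy_enrichment_alt": {"status": "failed", "started_at": None},
    }
}

print(split_failures(example_state))
```

For this run the only stage that ran and failed is `discovery_fetch_validation`, so that is the place to start debugging; the other two failures are probably consequences of it.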
Source: data/logs, .log files matched by name (API/orchestrator: 34b3d4b3-5f4a-43ff-b3a3-3ccbf42e41d0_*.log, CLI: pipeline_34b3d4b3_*.log).
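The name matching described here can be sketched in Python as follows. The helper names `log_patterns` and `find_logs` are hypothetical, and deriving the CLI prefix from the first eight characters of the execution ID is an assumption read off the two example patterns, not confirmed behaviour.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import List


def log_patterns(execution_id: str) -> List[str]:
    """Return the two glob patterns: API/orchestrator logs and CLI logs."""
    # Assumption: the CLI log name uses the first 8 characters of the ID.
    return [f"{execution_id}_*.log", f"pipeline_{execution_id[:8]}_*.log"]


def find_logs(execution_id: str, log_dir: str = "data/logs") -> List[Path]:
    """List log files in log_dir whose names match either pattern."""
    directory = Path(log_dir)
    if not directory.is_dir():
        return []
    patterns = log_patterns(execution_id)
    return sorted(
        path for path in directory.iterdir()
        if path.is_file() and any(fnmatch(path.name, pattern) for pattern in patterns)
    )


for pattern in log_patterns("34b3d4b3-5f4a-43ff-b3a3-3ccbf42e41d0"):
    print(pattern)
```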
data/logs/34b3d4b3-5f4a-43ff-b3a3-3ccbf42e41d0_20260329_105013.log
2026-03-29 10:50:13 | INFO | prefect.pipeline.parallel | Starting parallel pipeline execution 34b3d4b3-5f4a-43ff-b3a3-3ccbf42e41d0 for URL: https://apponyi.eu/
2026-03-29 10:50:13 | INFO | src.stages.stage_1_metadata_alt | Starting alternative metadata extraction stage
2026-03-29 10:50:13 | INFO | src.stages.stage_2_discovery_async | Starting discovery-fetch-validation (async) for URL: https://apponyi.eu/
2026-03-29 10:50:13 | INFO | src.stages.stage_1_metadata_alt | Querying metadata for: https://apponyi.eu/
2026-03-29 10:50:13 | INFO | src.stages.stage_2_discovery_async | Async discovery config: fetch=curl, output=html, prediction=http://docker-host:8000/predict
2026-03-29 10:50:13 | INFO | src.stages.stage_2_discovery_async | Async crawl starting: https://apponyi.eu/ (max_depth=2, max_concurrent=10)
2026-03-29 10:50:13 | INFO | src.stages.stage_1_metadata_alt | Downloading main URL: https://apponyi.eu/
2026-03-29 10:50:14 | INFO | src.stages.stage_2_discovery_async | Crawled (depth 0): https://apponyi.eu/
2026-03-29 10:50:14 | INFO | src.stages.stage_2_discovery_async | Crawl finished: 2 URLs in 0.3s (success=1, errors=0)
2026-03-29 10:50:14 | INFO | src.stages.stage_1_metadata_alt | Successfully extracted 985 characters from main URL
2026-03-29 10:50:14 | INFO | src.stages.stage_1_metadata_alt | Searching for contact pages using OpenSerp
2026-03-29 10:50:14 | INFO | src.stages.stage_1_metadata_alt | Trying OpenSerp API: http://openserp:7000/mega/search with params: {'text': 'cím kapcsolat telefonszám', 'site': 'apponyi.eu', 'limit': '3', 'lang': 'HU'}
2026-03-29 10:50:15 | INFO | src.stages.stage_2_discovery_async | Crawl produced 0 URLs from BERT (threshold and above), fetching all
2026-03-29 10:50:15 | ERROR | src.stages.stage_2_discovery_async | Async discovery: no BERT candidate URL produced valid content. Tried 0 URL(s).
2026-03-29 10:50:15 | INFO | src.stages.stage_2_discovery_async | Attempting fallback: original URL with trafilatura+markdown
2026-03-29 10:50:35 | INFO | src.stages.stage_1_metadata_alt | Successfully connected to OpenSerp at http://openserp:7000/mega/search
2026-03-29 10:50:35 | INFO | src.stages.stage_1_metadata_alt | OpenSerp returned 4 results
2026-03-29 10:50:35 | INFO | src.stages.stage_1_metadata_alt | Result 1: Adatvédelmi nyilatkozat - Kaposvár - https://www.apponyi.eu/adatvedelmi-nyilatkozat
2026-03-29 10:50:35 | INFO | src.stages.stage_1_metadata_alt | Result 2: Orvosaink - Apponyi Magánklinika Kaposvár - http://www.apponyi.eu/orvosaink
2026-03-29 10:50:35 | INFO | src.stages.stage_1_metadata_alt | Result 3: Szolgáltatások - Apponyi Magánklinika Kaposvár - https://www.apponyi.eu/szolgaltatasok
2026-03-29 10:50:35 | INFO | src.stages.stage_1_metadata_alt | Trying to download contact page 1/3: https://www.apponyi.eu/adatvedelmi-nyilatkozat
2026-03-29 10:50:35 | INFO | src.stages.stage_1_metadata_alt | Successfully downloaded and converted 17041 characters from contact page 1
2026-03-29 10:50:35 | INFO | src.stages.stage_1_metadata_alt | Trying to download contact page 2/3: http://www.apponyi.eu/orvosaink
2026-03-29 10:50:36 | INFO | src.stages.stage_1_metadata_alt | Successfully downloaded and converted 28134 characters from contact page 2
2026-03-29 10:50:36 | INFO | src.stages.stage_1_metadata_alt | Trying to download contact page 3/3: https://www.apponyi.eu/szolgaltatasok
2026-03-29 10:50:36 | INFO | src.stages.stage_1_metadata_alt | Successfully downloaded and converted 3271 characters from contact page 3
2026-03-29 10:50:36 | INFO | src.stages.stage_1_metadata_alt | Calling OpenRouter for metadata extraction (model=openai/gpt-5-mini)
2026-03-29 10:50:53 | INFO | src.stages.stage_1_metadata_alt | Successfully extracted metadata for: Apponyi Magánklinika
2026-03-29 10:50:53 | INFO | src.stages.stage_1_metadata_alt | Alternative metadata extraction stage completed
2026-03-29 10:50:54 | INFO | src.stages.stage_4_reviews | Starting reviews scraping stage
2026-03-29 10:50:54 | INFO | src.stages.stage_4_reviews | Found metadata directly: company_name=Apponyi Magánklinika, varos=Kaposvár
2026-03-29 10:50:54 | INFO | src.stages.stage_4_reviews | input_path: /tmp/tmpizlrp9r0
2026-03-29 10:50:54 | INFO | src.stages.stage_4_reviews | output_path: /tmp/tmpzvhxta_l
2026-03-29 10:50:54 | INFO | src.stages.stage_4_reviews | Running google-maps-scraper (attempt 1/3)
2026-03-29 10:51:49 | INFO | prefect.pipeline.parallel | Starting parallel pipeline execution 41e5d749-46e4-42c5-a37b-cb7bd1570bf6 for URL: https://apponyi.eu/
2026-03-29 10:51:49 | INFO | src.stages.stage_2_discovery_async | Starting discovery-fetch-validation (async) for URL: https://apponyi.eu/
2026-03-29 10:51:49 | INFO | src.stages.stage_2_discovery_async | Async discovery config: fetch=curl, output=html, prediction=http://docker-host:8000/predict
2026-03-29 10:51:49 | INFO | src.stages.stage_2_discovery_async | Async crawl starting: https://apponyi.eu/ (max_depth=2, max_concurrent=10)
2026-03-29 10:51:49 | INFO | src.stages.stage_1_metadata_alt | Starting alternative metadata extraction stage
2026-03-29 10:51:49 | INFO | src.stages.stage_1_metadata_alt | Querying metadata for: https://apponyi.eu/
2026-03-29 10:51:49 | INFO | src.stages.stage_1_metadata_alt | Downloading main URL: https://apponyi.eu/
2026-03-29 10:51:49 | INFO | src.stages.stage_2_discovery_async | Crawled (depth 0): https://apponyi.eu/
2026-03-29 10:51:49 | INFO | src.stages.stage_2_discovery_async | Crawl finished: 2 URLs in 0.2s (success=1, errors=0)
2026-03-29 10:51:49 | INFO | src.stages.stage_1_metadata_alt | Successfully extracted 985 characters from main URL
2026-03-29 10:51:49 | INFO | src.stages.stage_1_metadata_alt | Searching for contact pages using OpenSerp
2026-03-29 10:51:49 | INFO | src.stages.stage_1_metadata_alt | Trying OpenSerp API: http://openserp:7000/mega/search with params: {'text': 'cím kapcsolat telefonszám', 'site': 'apponyi.eu', 'limit': '3', 'lang': 'HU'}
2026-03-29 10:51:50 | INFO | src.stages.stage_2_discovery_async | Crawl produced 0 URLs from BERT (threshold and above), fetching all
2026-03-29 10:51:50 | ERROR | src.stages.stage_2_discovery_async | Async discovery: no BERT candidate URL produced valid content. Tried 0 URL(s).
2026-03-29 10:51:50 | INFO | src.stages.stage_2_discovery_async | Attempting fallback: original URL with trafilatura+markdown
2026-03-29 10:52:11 | INFO | src.stages.stage_1_metadata_alt | Successfully connected to OpenSerp at http://openserp:7000/mega/search
2026-03-29 10:52:11 | INFO | src.stages.stage_1_metadata_alt | OpenSerp returned 2 results
2026-03-29 10:52:11 | INFO | src.stages.stage_1_metadata_alt | Result 1: Adatvédelmi nyilatkozat - Kaposvár - https://www.apponyi.eu/adatvedelmi-nyilatkozat
2026-03-29 10:52:11 | INFO | src.stages.stage_1_metadata_alt | Result 2: PDF HONLAPRA - Apponyi - https://www.apponyi.eu/GINOP-3.2.2-8-2-4-16-2019-01873.pdf
2026-03-29 10:52:11 | INFO | src.stages.stage_1_metadata_alt | Trying to download contact page 1/3: https://www.apponyi.eu/adatvedelmi-nyilatkozat
2026-03-29 10:52:11 | INFO | src.stages.stage_1_metadata_alt | Successfully downloaded and converted 17041 characters from contact page 1
2026-03-29 10:52:11 | INFO | src.stages.stage_1_metadata_alt | Trying to download contact page 2/3: https://www.apponyi.eu/GINOP-3.2.2-8-2-4-16-2019-01873.pdf
2026-03-29 10:52:11 | INFO | src.stages.stage_1_metadata_alt | Successfully downloaded and converted 243645 characters from contact page 2
2026-03-29 10:52:11 | INFO | src.stages.stage_1_metadata_alt | Calling OpenRouter for metadata extraction (model=openai/gpt-5-mini)
2026-03-29 10:52:19 | INFO | src.stages.stage_1_metadata_alt | Successfully extracted metadata for: Apponyi Magánklinika
2026-03-29 10:52:19 | INFO | src.stages.stage_1_metadata_alt | Alternative metadata extraction stage completed
2026-03-29 10:52:20 | INFO | src.stages.stage_4_reviews | Starting reviews scraping stage
2026-03-29 10:52:20 | INFO | src.stages.stage_4_reviews | Found metadata directly: company_name=Apponyi Magánklinika, varos=Kaposvár
2026-03-29 10:52:20 | INFO | src.stages.stage_4_reviews | input_path: /tmp/tmp2q7gdymr
2026-03-29 10:52:20 | INFO | src.stages.stage_4_reviews | output_path: /tmp/tmp_23cidu3
2026-03-29 10:52:20 | INFO | src.stages.stage_4_reviews | Running google-maps-scraper (attempt 1/3)
2026-03-29 10:53:56 | INFO | src.stages.stage_4_reviews | google-maps-scraper completed successfully on attempt 1
2026-03-29 10:53:56 | INFO | src.stages.stage_4_reviews | Input fájl mentve: data/review/20260329_105356_apponyi_magánklinika_url_input.txt
2026-03-29 10:53:56 | INFO | src.stages.stage_4_reviews | Output fájl mentve: data/review/20260329_105356_apponyi_magánklinika_url_output.json
2026-03-29 10:53:56 | INFO | src.stages.stage_4_reviews | Reviews scraping completed. Found 1 reviews
2026-03-29 10:53:56 | INFO | prefect.pipeline.parallel | Branch 1 (metadata_alt -> reviews) completed successfully
2026-03-29 10:53:56 | ERROR | prefect.pipeline.parallel | Branch 2 failed: Async discovery: no BERT candidate URL produced valid content. Tried 0 URL(s).
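The log lines follow a `timestamp | LEVEL | logger | message` layout, and this file contains "Starting parallel pipeline execution" lines for two different execution IDs (34b3d4b3-5f4a-43ff-b3a3-3ccbf42e41d0 and 41e5d749-46e4-42c5-a37b-cb7bd1570bf6), so output from two runs appears interleaved. The Python sketch below parses such lines, lists the execution IDs announced in a file, and collects ERROR records. The names `LogRecord`, `parse_line`, `announced_executions` and `error_records` are hypothetical, and the field layout is inferred from the lines shown here.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class LogRecord(NamedTuple):
    timestamp: str
    level: str
    logger: str
    message: str


# Matches the start-of-run message and captures the 36-character execution ID.
START_RE = re.compile(r"Starting parallel pipeline execution ([0-9a-f-]{36}) for URL")


def parse_line(line: str) -> Optional[LogRecord]:
    """Split one log line into its four fields; return None if it does not fit."""
    parts = line.rstrip("\n").split(" | ", 3)
    if len(parts) != 4:
        return None
    return LogRecord(*parts)


def announced_executions(lines: Iterable[str]) -> List[str]:
    """Return execution IDs in the order their start messages appear."""
    ids: List[str] = []
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        match = START_RE.search(record.message)
        if match and match.group(1) not in ids:
            ids.append(match.group(1))
    return ids


def error_records(lines: Iterable[str]) -> List[LogRecord]:
    """Return only the records logged at ERROR level."""
    records = (parse_line(line) for line in lines)
    return [r for r in records if r is not None and r.level == "ERROR"]
```

If `announced_executions` returns more than one ID for a file, individual lines cannot be attributed to a run from the line content alone, since the stage loggers do not include the execution ID in their messages.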