Job Description
SQL, PyQt, PostgreSQL, general trade, industrial internet platform, 3C digital products, wholesale/retail/trade
Primary focus: web crawling system, intelligence data warehouse, data pipelines. This is the engineer who builds the moat.
Skillset Requirements
Essential Skills
1.Go--Production Go: goroutines, channels, context handling, error patterns. Strong preference — Go is the right language for crawlers and pipeline services.
2.Python--Python 3.10+ for data processing, glue code, ML-adjacent work.
3.Web crawling--Built production crawlers handling 100K+ pages: politeness, robots.txt, retries, deduplication
4.Headless browsers--Playwright, Puppeteer, chromedp, or Selenium for JavaScript-rendered sites
5.HTML parsing & content extraction--goquery, BeautifulSoup, trafilatura, readability — extracting clean text from messy HTML.
6.Distributed queues--NATS, RabbitMQ, Redis Streams, or Kafka — job orchestration at scale
7.PostgreSQL--Schema design, partitioning, indexing for large datasets.
8.Docker & Linux--Containerized services, systemd, Linux performance debugging.
9.Git--Branching, PRs, code review
10.English--Reading technical docs, writing code comments and PR descriptions in English
Other skills
1.Rust--Rust for performance-critical pipeline components (Tokio, async runtimes).
2.Vector databases--Milvus, Qdrant, Weaviate: index design and bulk loading.
3.Stream processing--Kafka Streams, Flink, NATS JetStream, or similar real-time pipelines.
4.Change data capture (CDC)--Debezium or similar; incremental data ingestion patterns.
5.Embeddings & chunking--Sentence-transformers, document chunking strategies for RAG.
6.Observability--Prometheus, Grafana, OpenTelemetry, structured logging.
7.Proxy & anti-bot handling--Proxy rotation, residential proxies, CAPTCHA strategies.
8.LangChain / LangGraph--Multi-step agent workflows for retrieval routing.
9.Bahasa Indonesia / SEA language familiarity--Helps with crawler targeting; not required.
Work Location
Bantian Subdistrict, Longgang District, Shenzhen


Updated today






