Ranking document similarity at scale with Spark NLP
Combining the power of Spark NLP sentence embeddings and LSH approximate nearest neighbors search pipelines to catch contextual and…
Jul 2, 2023

Spark NLP Document Similarity Ranker as-retriever for RAG tasks
Breaking news! Spark NLP (https://sparknlp.org/) gets enhanced with a new DocumentSimilarityRanker as-retriever interface for your RAG…
Mar 18, 2024

Polars is all you need: SQL chapter
I just found a powerful Python SQL API for my data analysis
May 20, 2023

Cleaning and extracting content from HTML/XML documents using Spark NLP
Published in spark-nlp
Spark NLP is an open-source text processing library for advanced natural language processing for the Python, Java and Scala programming…
Jan 13, 2021

Reliable and serverless data ingestion using Delta Lake on AWS Glue
The Big Data scenario
Jul 21, 2020