Subtopic Deep Dive

Information Retrieval in E-Commerce
Research Guide

What is Information Retrieval in E-Commerce?

Information Retrieval in E-Commerce applies search techniques like clustering, text analytics, and vector space models to enhance product discovery and user targeting on online platforms.

This subtopic covers methods such as K-means clustering for user characteristics (Habibi and Cahyo, 2019, 24 citations) and weighted inverse document frequency for search engines (Pratama et al., 2020, 15 citations). Researchers use datasets from social media and e-commerce sites to evaluate relevance. Over 10 papers from 2017-2022 address these techniques with citation counts up to 27.
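
As a rough illustration of the vector space approach mentioned above, the sketch below ranks a few made-up product titles against a query using TF-IDF weights and cosine similarity. The smoothed IDF formula, the whitespace tokenizer, and the sample PRODUCTS list are assumptions chosen for illustration; the exact weighting scheme of Pratama et al. (2020) is not reproduced here.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption: no stemming or stop-word removal).
    return text.lower().split()


def build_idf(docs):
    # Smoothed inverse document frequency; always positive.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def vectorize(text, idf):
    # Sparse TF-IDF vector as a dict; terms unseen in the corpus are dropped.
    tf = Counter(tokenize(text))
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs):
    # Returns (score, doc_index) pairs, best match first.
    idf = build_idf(docs)
    q = vectorize(query, idf)
    scored = [(cosine(q, vectorize(doc, idf)), i) for i, doc in enumerate(docs)]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical product titles, not taken from any cited dataset.
PRODUCTS = [
    "red running shoes for men",
    "blue running jacket",
    "leather office shoes",
]

ranking = search("red shoes", PRODUCTS)
for score, index in ranking:
    print(f"{score:.3f}  {PRODUCTS[index]}")
```

The title that matches both query terms ranks first, the one matching only "shoes" second, and the title with no shared terms scores zero.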

14 curated papers, 3 key challenges

Why It Matters

These methods support product recommendations on platforms like Instagram, where approximately 70% of users spend time searching for products (Habibi and Cahyo, 2019). Clustering user traits by hashtag supports targeted advertising (Habibi and Cahyo, 2019), while stemming and vector space models improve query matching in niche search settings such as hadith databases, an approach that can be adapted to e-commerce (Ibrahim, 2014; Pratama et al., 2020). In trillion-dollar markets, better retrieval can lift conversion rates by returning more relevant results.

Key Research Challenges

Sparse E-Commerce Queries

Short user queries on shopping platforms lack context, reducing retrieval accuracy (Anistya and Setiawan, 2021). Feature expansion with GloVe helps but struggles with domain-specific terms. Evaluation on real shopping data shows NDCG gaps.
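
The idea of expanding a short query with embedding neighbours can be sketched as follows. This is a minimal sketch under stated assumptions: TOY_VECTORS is a hand-made, 3-dimensional stand-in for pretrained GloVe vectors, and the expand_query helper with its top_k and min_sim parameters is hypothetical rather than taken from Anistya and Setiawan (2021).

```python
import math

# Hypothetical toy vectors standing in for pretrained GloVe embeddings.
TOY_VECTORS = {
    "sneakers": (0.9, 0.1, 0.0),
    "trainers": (0.85, 0.15, 0.05),
    "shoes": (0.8, 0.2, 0.1),
    "laptop": (0.0, 0.9, 0.3),
    "notebook": (0.05, 0.85, 0.35),
    "red": (0.1, 0.1, 0.9),
}


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand_query(query, vectors, top_k=2, min_sim=0.95):
    # Append up to top_k nearest vocabulary words for each known query term.
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        if term not in vectors:
            continue  # out-of-vocabulary terms cannot be expanded
        neighbours = sorted(
            (
                (cosine(vectors[term], vec), word)
                for word, vec in vectors.items()
                if word not in expanded
            ),
            reverse=True,
        )
        expanded.extend(word for sim, word in neighbours[:top_k] if sim >= min_sim)
    return expanded


print(expand_query("sneakers", TOY_VECTORS))
print(expand_query("airfryer", TOY_VECTORS))
```

The second call shows the limitation noted above: a domain-specific term missing from the embedding vocabulary comes back unchanged, so expansion adds no context for it.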

User Personalization Scaling

Clustering millions of users with K-means demands efficient scaling (Maylawati et al., 2020; Habibi and Cahyo, 2019). High-dimensional hashtag data is sparse, so distances become less informative and clusters can fit noise rather than real user segments. Real-time personalization remains computationally intensive.
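
One common way to reduce the cost of K-means on large user sets is to update centroids from small random batches instead of the full dataset. The sketch below illustrates that mini-batch variant; it is not the method of the cited papers, and the USERS feature vectors, the farthest-point initialisation, and all parameter values are assumptions for illustration.

```python
import random


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    # Index of the closest centroid.
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def farthest_point_init(points, k):
    # Deterministic init: start at the first point, then repeatedly
    # add the point farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def mini_batch_kmeans(points, k, batch_size=4, n_iter=50, seed=0):
    rng = random.Random(seed)
    centroids = farthest_point_init(points, k)
    counts = [0] * k
    for _ in range(n_iter):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch first, then apply the updates.
        assignments = [nearest(p, centroids) for p in batch]
        for p, j in zip(batch, assignments):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-centroid learning rate
            centroids[j] = [(1 - eta) * c + eta * x for c, x in zip(centroids[j], p)]
    return centroids


# Hypothetical 2-D user features (e.g. two engagement scores per user).
USERS = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
    (10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0),
]

centroids = mini_batch_kmeans(USERS, k=2)
labels = [nearest(p, centroids) for p in USERS]
print(labels)
```

Each iteration touches only batch_size points, so the per-step cost no longer grows with the number of users; the trade-off is noisier centroids than full-batch K-means.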

Relevance Metric Evaluation

Standard metrics like NDCG fail on diverse e-commerce datasets with subjective relevance (Saikin et al., 2021). Sentiment from reviews adds noise to ranking. Validating models requires large annotated shopping corpora.
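
To make the metric concrete, the following sketch computes NDCG for one ranked list and shows how the score shifts when two annotators disagree about relevance. The exponential gain (2^rel - 1) is one common variant, the ideal ranking is taken over the same retrieved items only, and the annotator labels are invented for illustration.

```python
import math


def dcg(relevances, k=None):
    # Discounted cumulative gain with exponential gain and log2 discount.
    rels = relevances[:k] if k is not None else relevances
    return sum((2 ** rel - 1) / math.log2(rank + 2) for rank, rel in enumerate(rels))


def ndcg(relevances, k=None):
    # Normalise by the DCG of the same labels sorted best-first.
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# Graded labels (0-3) for the same three ranked products,
# as judged by two hypothetical annotators.
annotator_a = [3, 2, 0]
annotator_b = [0, 2, 3]

print(f"NDCG (annotator A): {ndcg(annotator_a):.3f}")
print(f"NDCG (annotator B): {ndcg(annotator_b):.3f}")
```

The ranking is identical in both calls; only the judgments differ, which is why subjective relevance makes a single NDCG figure hard to interpret.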

Essential Papers

1. Data science for digital culture improvement in higher education using K-means clustering and text analytics

Dian Sa’adillah Maylawati, Tedi Priatna, Hamdan Sugilar et al. · 2020 · International Journal of Power Electronics and Drive Systems/International Journal of Electrical and Computer Engineering · 27 citations

This study aims to investigate the meaningful pattern that can be used to improve digital culture in higher education based on parameters of the technology acceptance model (TAM). The methodology u...

2. Clustering User Characteristics Based on the influence of Hashtags on the Instagram Platform

Muhammad Habibi, Puji Winar Cahyo · 2019 · IJCCS (Indonesian Journal of Computing and Cybernetics Systems) · 24 citations

Instagram is a social media that has the potential to be used to increase awareness of a product. Approximately 70% of users spend their time searching for a product on Instagram. Many people promo...

3. Optimization of Support Vector Machine Method Using Feature Selection to Improve Classification Results

Saikin Saikin, Sofiansyah Fadli, M.Kom Ahmad Ashari et al. · 2021 · JISA(Jurnal Informatika dan Sains) · 23 citations

The performance of the organizations or companies are based on the qualities possessed by their employee. Both of good or bad employee performance will have an impact on productivity and the impact ...

4. Sentiment Analysis on PeduliLindungi Application Using TextBlob and VADER Library

Fathonah Illia, Migunani Puspita Eugenia, Sita Aliya Rutba · 2022 · Proceedings of The International Conference on Data Science and Official Statistics · 21 citations

The Covid-19 virus has become a global pandemic, including Indonesia. Various efforts have been made by the Government to reduce the negative impact by this pandemic, one of which is through the Pe...

5. Hate Speech Detection on Twitter in Indonesia with Feature Expansion Using GloVe

Febiana Anistya, Erwin Budi Setiawan · 2021 · Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) · 21 citations

Twitter is one of the popular social media to channel opinions in the form of criticism and suggestions. Criticism could be a form of hate speech if the criticism implies attacking something (an in...

6. Balinese Historian Chatbot using Full-Text Search and Artificial Intelligence Markup Language Method

Kadek Teguh Wirawan, I Made Sukarsa, I Putu Agung Bayupati · 2019 · International Journal of Intelligent Systems and Applications · 17 citations

In the era of technology, various information could be obtained quickly and easily. The history of Bali is one of the information that could be obtained. Balinese have known their history through Bab...

7. Online Newspaper Clustering in Aceh using the Agglomerative Hierarchical Clustering Method

Rizal Tjut Adek, Rozzy Kesuma Dinata, Ananda Ditha · 2021 · International Journal of Engineering Science and Information Technology · 16 citations

The rapid progress in the field of information technology, especially the internet, has given birth to a lot of information. The ease of publishing an article on a website causes an explosion of ne...

Reading Guide

Foundational Papers

Start with Ibrahim (2014) on prefix stemming for query normalization, then Sethi (1987) on natural language interfaces applicable to e-commerce search understanding.

Recent Advances

Study Habibi and Cahyo (2019) for hashtag clustering, Pratama et al. (2020) for vector models, and Anistya and Setiawan (2021) for GloVe expansion in relevance ranking.

Core Methods

Core techniques include K-means clustering (Habibi and Cahyo, 2019), support vector machines with feature selection (Saikin et al., 2021), weighted IDF vector space (Pratama et al., 2020), and GloVe embeddings (Anistya and Setiawan, 2021).

How PapersFlow Helps You Research Information Retrieval in E-Commerce

Discover & Search

Research Agent uses searchPapers and exaSearch to find core papers like 'Clustering User Characteristics...Instagram' by Habibi and Cahyo (2019), then citationGraph reveals 24-citation connections to Pratama et al. (2020) on vector models. findSimilarPapers expands to related clustering works for e-commerce personalization.

Analyze & Verify

Analysis Agent applies readPaperContent to extract K-means implementations from Maylawati et al. (2020), verifies claims about TAM parameters with CoVe, and uses runPythonAnalysis with NumPy/pandas to recompute clustering metrics. GRADE scores evidence strength for relevance in shopping datasets.

Synthesize & Write

Synthesis Agent detects gaps in personalization scaling across Habibi and Cahyo (2019) and Saikin et al. (2021), and flags contradictions in feature selection. Writing Agent uses latexEditText and latexSyncCitations for 10-paper reviews, and latexCompile to generate NDCG evaluation reports with exportMermaid diagrams.

Use Cases

"Reimplement K-means clustering from Habibi 2019 for e-commerce user segmentation."

Research Agent → searchPapers('Habibi Instagram clustering') → Analysis Agent → readPaperContent → runPythonAnalysis (pandas NumPy sandbox recreates hashtag clusters) → outputs segmented user CSV with silhouette scores.
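
The silhouette score named in this workflow can be computed without external libraries. The sketch below is a hedged illustration, not the output of runPythonAnalysis: the POINTS data and both label assignments are invented, and math.dist requires Python 3.8 or later.

```python
import math


def silhouette_score(points, labels):
    # Mean silhouette coefficient over all points (Euclidean distance).
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # singleton clusters contribute a score of 0
        # a: mean distance to the other members of the same cluster
        # (the point's distance to itself is 0, hence len(own) - 1).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical 2-D user features forming two obvious groups.
POINTS = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]
bad_labels = [0, 1, 0, 1, 0, 1]

good = silhouette_score(POINTS, good_labels)
bad = silhouette_score(POINTS, bad_labels)
print(f"good split: {good:.3f}, bad split: {bad:.3f}")
```

Scores near 1 indicate compact, well-separated segments; scores near 0 or below suggest that the chosen number of clusters or the features need revisiting.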

"Write LaTeX review of vector space models in e-commerce search like Pratama 2020."

Research Agent → citationGraph → Synthesis Agent → gap detection → Writing Agent → latexEditText + latexSyncCitations (10 papers) + latexCompile → outputs compiled PDF with relevance metric tables.

"Find GitHub code for GloVe feature expansion in product search."

Research Agent → searchPapers('GloVe e-commerce') → Code Discovery → paperExtractUrls → paperFindGithubRepo → githubRepoInspect (Anistya 2021) → outputs verified repo with Indonesian Twitter adaptation scripts for shopping queries.

Automated Workflows

Deep Research workflow scans 50+ papers via searchPapers on 'e-commerce clustering' and structures reports with NDCG comparisons from Habibi and Cahyo (2019) and Pratama et al. (2020). DeepScan applies 7-step CoVe to verify the K-means results on TAM parameters in Maylawati et al. (2020). Theorizer generates hypotheses on hashtag-driven ranking from citationGraph clusters.

Frequently Asked Questions

What defines Information Retrieval in E-Commerce?

It applies clustering, text analytics, and vector models to improve product search and user targeting (Habibi and Cahyo, 2019; Pratama et al., 2020).

What are key methods used?

K-means for user clustering (Maylawati et al., 2020), GloVe feature expansion (Anistya and Setiawan, 2021), and weighted IDF with vector space (Pratama et al., 2020).

What are prominent papers?

Habibi and Cahyo (2019, 24 citations) on Instagram clustering; Pratama et al. (2020, 15 citations) on hadith search adaptable to products; Maylawati et al. (2020, 27 citations) on K-means clustering over technology acceptance model (TAM) parameters.

What open problems exist?

Scaling personalization to millions of users, handling sparse queries, and robust NDCG evaluation on diverse shopping data (Saikin et al., 2021; Habibi and Cahyo, 2019).
