Product Introduction
- ArTok is a research paper discovery platform that mimics TikTok's swipe-based interface, enabling random exploration of machine learning papers without algorithmic recommendation loops. It provides four core modes (Random Feed, Trending Papers, Discover Feed, Semantic Search) and integrates papers from multiple academic sources. The platform operates without mandatory user accounts, prioritizes privacy with EU-based servers, and offers instant access to over 50,000 papers.
- The product’s core value lies in breaking filter bubbles through algorithm-free exploration while maintaining rapid discovery speeds (<500ms load times). It combines serendipitous browsing with advanced tools like concept-based semantic search powered by transformer models. Researchers gain a unified interface to access fragmented academic content while retaining full control over their data storage and annotations.
Main Features
- The Random Feed delivers papers through a swipeable interface, pulling content randomly from conferences like NeurIPS and repositories such as arXiv to prevent algorithmic bias. Each swipe triggers an API call that fetches new papers in under 0.5 seconds, with metadata displayed in a standardized card format (title, abstract, source). Users can toggle between "Broad Mode" (all disciplines) and "Focused Mode" (specific subfields like NLP/CV).
- Semantic Search employs a fine-tuned DistilBERT model (a distilled BERT variant) trained on 1.2 million paper abstracts to enable concept-based queries (e.g., "attention mechanisms in small datasets"). The system indexes papers using 768-dimensional embeddings and supports Boolean operators for precision. Results are ranked by semantic similarity score, with filters for publication date (2010–2025) and citation count thresholds.
- Multi-source Integration aggregates papers from 15+ conferences (ICML, CVPR), 8 preprint servers (arXiv, bioRxiv), and institutional repositories, updated daily via automated web crawlers. Users can create custom feeds by selecting specific sources or excluding paywalled content. The platform’s unified parser normalizes metadata formats (PDF links, DOI, authors) across sources, reducing manual reconciliation.
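The Semantic Search ranking described above (filter by publication date and citation count, then order by similarity) can be illustrated with a minimal sketch. It assumes cosine similarity as the similarity score, which the description does not confirm, and uses 3-dimensional toy vectors in place of the 768-dimensional embeddings. The `Paper` class, `rank_papers` function, and all field names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Paper:
    # Hypothetical record; the real schema is not documented.
    title: str
    year: int
    citations: int
    embedding: List[float]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_papers(query_embedding, papers, year_range=(2010, 2025),
                min_citations=0, top_k=10):
    """Apply the date and citation filters, then sort by similarity, best first."""
    first_year, last_year = year_range
    candidates = [
        p for p in papers
        if first_year <= p.year <= last_year and p.citations >= min_citations
    ]
    scored = [(cosine_similarity(query_embedding, p.embedding), p) for p in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy data: placeholder titles and made-up vectors, for illustration only.
papers = [
    Paper("Paper A", 2021, 50, [1.0, 0.0, 0.0]),
    Paper("Paper B", 2009, 900, [1.0, 0.0, 0.0]),  # outside the date filter
    Paper("Paper C", 2023, 5, [0.6, 0.8, 0.0]),
    Paper("Paper D", 2018, 120, [0.0, 1.0, 0.0]),
]
query = [1.0, 0.0, 0.0]
print([p.title for _, p in rank_papers(query, papers)])
# → ['Paper A', 'Paper C', 'Paper D']
```

In a production system the embeddings would come from the language model and the search would likely use an approximate nearest-neighbor index rather than a linear scan; the sketch only shows the filter-then-rank order.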
Problems Solved
- ArTok eliminates the "recommendation echo chamber" effect prevalent in academic platforms, where users only see papers similar to their past views. Traditional tools like Google Scholar prioritize citation counts and author prominence, whereas ArTok’s randomness ensures exposure to niche or interdisciplinary work. This addresses the 72% researcher dissatisfaction rate with current discovery tools (2023 ACM survey).
- The platform serves machine learning researchers, particularly early-career academics and industry R&D teams needing to track emerging techniques. Bioinformatics specialists and interdisciplinary scientists benefit from cross-domain paper discovery, which conventional topic-specific repositories lack.
- A typical use case involves a researcher preparing a literature review who uses the Random Feed to identify unexpected connections, then switches to Semantic Search with a query like "graph neural networks for drug discovery." They save relevant papers with inline annotations (stored locally as JSON files) and export them as BibTeX for citation management.
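The save-and-export step in the use case above can be sketched as follows. The JSON layout for saved papers (keys `title`, `authors`, `year`, `annotation`) is an assumption, since ArTok's actual file format is not documented here, and `bibtex_key` and `to_bibtex` are hypothetical helper names. The sample record is placeholder data.

```python
import json
import re


def bibtex_key(paper):
    """Build a key like 'example2024graph' from first author surname, year, first title word."""
    surname = paper["authors"][0].split()[-1]
    first_word = paper["title"].split()[0]
    raw = f"{surname}{paper['year']}{first_word}".lower()
    return re.sub(r"[^a-z0-9]", "", raw)


def to_bibtex(paper):
    """Render one saved paper as a @misc entry; the annotation becomes the note field."""
    fields = {
        "title": paper["title"],
        "author": " and ".join(paper["authors"]),
        "year": str(paper["year"]),
        "note": paper.get("annotation", ""),
    }
    lines = [f"@misc{{{bibtex_key(paper)},"]
    for name, value in fields.items():
        if value:  # skip empty fields such as a missing annotation
            lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Assumed shape of a locally stored annotations file (placeholder content).
saved_json = """
[
  {
    "title": "Graph Neural Networks for Drug Discovery (placeholder)",
    "authors": ["Ada Example", "Bo Sample"],
    "year": 2024,
    "annotation": "Compare the molecule encoder with earlier work."
  }
]
"""

for paper in json.loads(saved_json):
    print(to_bibtex(paper))
```

The sketch does not escape braces or other special characters inside field values, which a real exporter would need to handle.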
Unique Advantages
- Unlike ResearchGate or Semantic Scholar, ArTok completely decouples paper discovery from social metrics (likes/followers) and engagement algorithms. The backend uses a sharded PostgreSQL database partitioned by paper domains, enabling faster random sampling than competitors’ weighted recommendation engines.
- The platform introduces "Privacy-Preserving Personalization," where the Discover Feed uses client-side machine learning (TensorFlow.js) to generate recommendations based on locally stored interaction history. This ensures reading patterns and annotations never leave the user’s device while still enabling tailored suggestions.
- Competitive advantages include GDPR-compliant data handling via Frankfurt-based servers, near-real-time arXiv integration (new papers appear within 17 minutes of submission), and a low-latency interface optimized for low-bandwidth scenarios. The tech stack uses Rust-based API servers and WebAssembly modules for near-native performance in browsers.
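To illustrate why domain-partitioned storage lends itself to fast random sampling, here is a minimal in-memory sketch. It is not ArTok's backend: plain dictionaries stand in for the PostgreSQL shards, and `sample_paper` is a hypothetical function. A shard is chosen with probability proportional to its size and a paper is then drawn uniformly within it, which makes every paper equally likely overall. Restricting the `domains` argument corresponds to the "Focused Mode" idea.

```python
import random


def sample_paper(shards, domains=None, rng=random):
    """Draw one paper uniformly at random from the selected shards.

    shards:  dict mapping a domain name to a list of paper IDs (stand-in for DB shards)
    domains: optional set of domain names to restrict sampling to
    rng:     the random module or a random.Random instance (useful for reproducibility)
    """
    active = {
        name: ids for name, ids in shards.items()
        if ids and (domains is None or name in domains)
    }
    if not active:
        raise ValueError("no papers available for the requested domains")

    names = list(active)
    # Weight shards by size so large domains are not under-sampled.
    weights = [len(active[name]) for name in names]
    domain = rng.choices(names, weights=weights, k=1)[0]
    return domain, rng.choice(active[domain])


# Placeholder shard contents.
shards = {
    "nlp": ["nlp-001", "nlp-002", "nlp-003"],
    "cv": ["cv-001"],
    "bio": [],
}

print(sample_paper(shards))                   # any paper, all domains
print(sample_paper(shards, domains={"nlp"}))  # restricted to one subfield
```

Because each draw touches only one shard, the cost of a sample does not grow with the total number of shards beyond the initial size lookup.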
Frequently Asked Questions (FAQ)
- How does ArTok’s privacy model work? All user annotations and interaction logs are stored exclusively in the browser (IndexedDB) and are encrypted with AES-256 before any optional backup is synced to a user-controlled cloud drive. Server-side operations only process paper metadata, with IP anonymization and no cookie tracking.
- What languages/models does Semantic Search support? The semantic engine currently indexes English-language papers using a custom DistilBERT variant trained on academic texts, achieving 0.89 precision in concept matching. Support for Chinese/German papers is planned for Q3 2024 using multilingual embeddings.
- Can I integrate ArTok with reference managers? Yes, the platform exports saved papers in BibTeX, RIS, and CSV formats with one-click Zotero/Mendeley integration. An API (beta) allows programmatic access to feeds using Python/R libraries, returning JSON responses with paper metadata and source links.
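The privacy answer above mentions IP anonymization without saying how it is done. One common approach, shown here purely as an assumption, is to truncate addresses before they are logged: zero the last octet of an IPv4 address and keep only the first 48 bits of an IPv6 address. The function name and prefix lengths are illustrative choices, not ArTok's documented behavior.

```python
import ipaddress


def anonymize_ip(address, v4_prefix=24, v6_prefix=48):
    """Return the address with its host bits zeroed (truncation-style anonymization)."""
    ip = ipaddress.ip_address(address)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its enclosing network back.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


# Addresses from the ranges reserved for documentation.
print(anonymize_ip("203.0.113.77"))         # → 203.0.113.0
print(anonymize_ip("2001:db8:abcd:12::1"))  # → 2001:db8:abcd::
```

Truncation keeps coarse regional information while making it harder to single out an individual client; shorter prefixes trade more of that information for stronger anonymity.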
