rybbit/mockdata
2025-04-16 23:58:39 -07:00
..
index.js Optimize session event generation and data handling for improved performance 2025-03-23 23:23:11 -07:00
package-lock.json Refactor Filters component to improve parameter handling and UI clarity 2025-03-23 22:53:45 -07:00
package.json Refactor Filters component to improve parameter handling and UI clarity 2025-03-23 22:53:45 -07:00
README.md Rename project from FrogStats to Rybbit and update related documentation and UI components accordingly. Adjusted code guidelines, modified organization setup prompts, and updated example HTML to reflect the new branding. Added usage documentation and improved self-hosting instructions for clarity. 2025-04-16 23:58:39 -07:00

E-commerce Mock Data Generator for Rybbit

This script generates realistic mock analytics data for Rybbit, simulating user behavior on an e-commerce website called ShopEase.

Features

  • Generates approximately 5 million events per day over a specified number of days
  • Creates realistic user shopping sessions with multiple pageviews and custom events
  • Models realistic e-commerce user journeys (browsing, product views, add to cart, checkout, purchase)
  • Sessions end after 30 minutes of inactivity
  • Distributes traffic realistically throughout the day (with peak hours)
  • Uses Faker.js to generate realistic product data, location information, and more
  • Uses weighted distributions for pages, browsers, operating systems, screen sizes, and more
  • Varies the number of events per day to be more realistic

Requirements

  • Node.js 14+
  • ClickHouse database set up (configuration in .env file)

Installation

  1. Make sure all dependencies are installed:
npm install

Usage

Run the script with optional parameters:

# Default: 30 days with ~5 million events per day
npm run generate

# Custom parameters (days, events per day)
node index.js 15 2000000  # 15 days with ~2 million events per day

Parameters

  1. daysInPast: Number of days in the past to generate data for (default: 30)
  2. eventsPerDay: Approximate number of events per day (default: 5,000,000)

Data Distribution

The script creates realistic distributions of:

  • Page views with appropriate e-commerce user flows
  • Shopping behavior patterns (browsing, cart additions, checkouts, purchases)
  • Geographic distribution (countries and regions using Faker.js)
  • Browser types and versions
  • Operating systems
  • Screen resolutions
  • Referrers (including social media and other e-commerce sites)
  • Custom events specific to e-commerce (product views, add to cart, checkout steps, etc.)
  • Time of day (peak hours vs. off hours)

E-commerce Event Types

The generator includes these e-commerce-specific events:

  • page-view - Basic page view tracking
  • product-view - When a user views a product detail page
  • add-to-cart - When a user adds an item to their cart
  • remove-from-cart - When a user removes an item from their cart
  • begin-checkout - When a user starts the checkout process
  • checkout-step - Tracking each step of the checkout funnel
  • purchase - When a user completes a purchase
  • search - When a user searches for products
  • filter-products - When a user applies filters to product listings
  • add-to-wishlist - When a user adds items to their wishlist
  • And more...

Data Schema

Events are inserted into the ClickHouse pageviews table with the following schema:

  • site_id
  • timestamp
  • session_id
  • user_id
  • hostname
  • pathname
  • querystring
  • page_title
  • referrer
  • browser
  • browser_version
  • operating_system
  • operating_system_version
  • language
  • country
  • iso_3166_2
  • screen_width
  • screen_height
  • device_type
  • type (pageview or custom_event)
  • event_name (for custom events)
  • properties (JSON string for custom events)