> ## Documentation Index
> Fetch the complete documentation index at: https://docs.flipsidecrypto.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Data Freshness & Monitoring

> Monitor data updates and build efficient incremental pipelines in Snowflake

Flipside data shares in Snowflake are continuously updated as blockchain data is produced. This page
shows you how to monitor freshness and build incremental data pipelines using Snowflake SQL.

<Note>
  **Looking for latency targets?** See the [general Data Freshness
  guide](/blockchain-data/data-freshness) for detailed latency expectations by schema type and
  blockchain.
</Note>

## Quick freshness check

Use this SQL query to check how current your data is:

```sql theme={null}
-- Check latest block timestamp for a chain
SELECT
    MAX(block_number) AS latest_block,
    MAX(block_timestamp) AS latest_timestamp,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ethereum.core.fact_blocks;
```

**Interpretation:**

* **Less than 1 hour behind**: Normal for Core tables
* **1-6 hours behind**: Normal for DeFi/NFT tables
* **Greater than 24 hours behind**: Potential issue, contact support

## Monitoring freshness across schemas

Monitor data freshness across multiple table types:

```sql theme={null}
-- Check freshness of different schemas
SELECT
    'Core: fact_blocks' AS table_name,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ethereum.core.fact_blocks

UNION ALL

SELECT
    'Core: fact_transactions' AS table_name,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ethereum.core.fact_transactions

UNION ALL

SELECT
    'DeFi: ez_dex_swaps' AS table_name,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ethereum.defi.ez_dex_swaps

UNION ALL

SELECT
    'NFT: ez_nft_sales' AS table_name,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ethereum.nft.ez_nft_sales

ORDER BY minutes_behind DESC;
```

## Cross-chain freshness comparison

Compare freshness across multiple blockchains:

```sql theme={null}
-- Compare freshness across your mounted shares
SELECT
    'Ethereum' AS chain,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ETHEREUM_CORE.ethereum.core.fact_blocks

UNION ALL

SELECT
    'Arbitrum' AS chain,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ARBITRUM_CORE.arbitrum.core.fact_blocks

UNION ALL

SELECT
    'Solana' AS chain,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM SOLANA_CORE.solana.core.fact_blocks

ORDER BY minutes_behind;
```

## Understanding timestamp fields

All Flipside tables include three timestamp fields with different purposes:

### `block_timestamp` (clustered)

The timestamp when the block was produced on the blockchain.

**Clustered:** ✅ Yes (primary clustering key)

**Use for:**

* Time-series analysis
* Historical queries
* Filtering by when events occurred on-chain
* **Always use this for performance** (see
  [Query Optimization](/snowflake-data-shares/query-optimization))

```sql theme={null}
-- Analyze transactions from last week (fast - uses clustering)
SELECT *
FROM ethereum.core.fact_transactions
WHERE block_timestamp >= CURRENT_DATE - 7;
```

### `_inserted_timestamp` (not clustered)

The timestamp when the row was first inserted into Flipside's database.

**Clustered:** ❌ No

**Use for:**

* Identifying newly added data
* Incremental data loads
* Monitoring ingestion progress

```sql theme={null}
-- Find data inserted in last hour
SELECT *
FROM ethereum.core.fact_transactions
WHERE _inserted_timestamp >= DATEADD(hour, -1, CURRENT_TIMESTAMP())
  AND block_timestamp >= CURRENT_DATE - 7;  -- Still include block_timestamp for performance
```

<Warning>
  **Performance tip:** Always include a `block_timestamp` filter even when filtering on
  `_inserted_timestamp` to leverage clustering.
</Warning>

### `_modified_timestamp` (not clustered)

The timestamp when the row was last modified (for updates or corrections).

**Clustered:** ❌ No

**Use for:**

* Detecting data corrections
* Change data capture (CDC)
* Audit trails

```sql theme={null}
-- Find rows modified recently (corrections/updates)
SELECT
    block_timestamp,
    tx_hash,
    _inserted_timestamp,
    _modified_timestamp
FROM ethereum.core.fact_transactions
WHERE _modified_timestamp >= DATEADD(hour, -24, CURRENT_TIMESTAMP())
  AND block_timestamp >= CURRENT_DATE - 30  -- Always filter by block_timestamp
ORDER BY _modified_timestamp DESC;
```

## Incremental data loading patterns

For building data pipelines that process only new data:

### Pattern 1: Using `_inserted_timestamp`

Best for tracking which rows are new in Flipside's database:

```sql theme={null}
-- Step 1: Create checkpoint table
CREATE OR REPLACE TABLE my_checkpoints (
    table_name VARCHAR,
    last_processed_timestamp TIMESTAMP_NTZ
);

-- Step 2: Initialize checkpoint (run once)
INSERT INTO my_checkpoints
VALUES ('fact_transactions', CURRENT_TIMESTAMP());

-- Step 3: Incremental load query
SELECT *
FROM ethereum.core.fact_transactions
WHERE _inserted_timestamp > (
    SELECT last_processed_timestamp
    FROM my_checkpoints
    WHERE table_name = 'fact_transactions'
)
  AND block_timestamp >= CURRENT_DATE - 30  -- Always include for performance
ORDER BY _inserted_timestamp;

-- Step 4: Update checkpoint after successful load
UPDATE my_checkpoints
SET last_processed_timestamp = CURRENT_TIMESTAMP()
WHERE table_name = 'fact_transactions';
```

### Pattern 2: Using `block_timestamp` (recommended)

Best for time-based incremental processing:

```sql theme={null}
-- Step 1: Create checkpoint table
CREATE OR REPLACE TABLE my_checkpoints (
    table_name VARCHAR,
    last_processed_block_time TIMESTAMP_NTZ
);

-- Step 2: Initialize checkpoint (run once)
INSERT INTO my_checkpoints
SELECT
    'fact_transactions' AS table_name,
    MAX(block_timestamp) AS last_processed_block_time
FROM my_existing_local_table;

-- Step 3: Incremental load with buffer for late data
SELECT *
FROM ethereum.core.fact_transactions
WHERE block_timestamp > (
    SELECT last_processed_block_time
    FROM my_checkpoints
    WHERE table_name = 'fact_transactions'
)
  AND block_timestamp <= CURRENT_TIMESTAMP() - INTERVAL '15 minutes'  -- Buffer for late-arriving data
ORDER BY block_timestamp;

-- Step 4: Update checkpoint after successful load
UPDATE my_checkpoints
SET last_processed_block_time = (
    SELECT MAX(block_timestamp)
    FROM my_existing_local_table
)
WHERE table_name = 'fact_transactions';
```

<Tip>
  **Best practice:** Use `block_timestamp` with a 10-15 minute buffer to account for late-arriving
  or reprocessed data due to chain reorgs.
</Tip>

### Pattern 3: Upsert with MERGE

For maintaining an up-to-date replica with corrections:

```sql theme={null}
-- Incremental upsert using MERGE
MERGE INTO my_local_transactions AS target
USING (
    SELECT *
    FROM ethereum.core.fact_transactions
    WHERE _modified_timestamp >= DATEADD(hour, -24, CURRENT_TIMESTAMP())
      AND block_timestamp >= CURRENT_DATE - 30
) AS source
ON target.tx_hash = source.tx_hash
WHEN MATCHED THEN
    UPDATE SET
        target.block_number = source.block_number,
        target.block_timestamp = source.block_timestamp,
        target.from_address = source.from_address,
        target.to_address = source.to_address,
        target.eth_value = source.eth_value,
        -- ... other columns
        target._modified_timestamp = source._modified_timestamp
WHEN NOT MATCHED THEN
    INSERT (tx_hash, block_number, block_timestamp, from_address, to_address, eth_value, _modified_timestamp)
    VALUES (source.tx_hash, source.block_number, source.block_timestamp, source.from_address, source.to_address, source.eth_value, source._modified_timestamp);
```

## Data quality validation

Flipside runs 40,000+ automated tests continuously, but you can also validate data yourself:

### Check for gaps in block data

```sql theme={null}
-- Identify missing blocks in sequence
WITH block_sequence AS (
    SELECT
        block_number,
        LAG(block_number) OVER (ORDER BY block_number) AS prev_block
    FROM ethereum.core.fact_blocks
    WHERE block_timestamp >= CURRENT_DATE - 1
)
SELECT
    prev_block AS last_block_before_gap,
    block_number AS first_block_after_gap,
    block_number - prev_block AS gap_size
FROM block_sequence
WHERE block_number - prev_block > 1
ORDER BY gap_size DESC;
```

### Validate transaction counts

```sql theme={null}
-- Verify transaction counts match between fact_blocks and fact_transactions
WITH block_counts AS (
    SELECT
        block_number,
        tx_count AS reported_tx_count
    FROM ethereum.core.fact_blocks
    WHERE block_timestamp >= CURRENT_DATE - 1
),
actual_counts AS (
    SELECT
        block_number,
        COUNT(*) AS actual_tx_count
    FROM ethereum.core.fact_transactions
    WHERE block_timestamp >= CURRENT_DATE - 1
    GROUP BY block_number
)
SELECT
    b.block_number,
    b.reported_tx_count,
    COALESCE(a.actual_tx_count, 0) AS actual_tx_count,
    b.reported_tx_count - COALESCE(a.actual_tx_count, 0) AS difference
FROM block_counts b
LEFT JOIN actual_counts a ON b.block_number = a.block_number
WHERE b.reported_tx_count != COALESCE(a.actual_tx_count, 0)
ORDER BY ABS(b.reported_tx_count - COALESCE(a.actual_tx_count, 0)) DESC
LIMIT 20;
```

## Automated monitoring with tasks

Set up Snowflake tasks to monitor freshness automatically:

```sql theme={null}
-- Create a task to check freshness hourly
CREATE OR REPLACE TASK monitor_ethereum_freshness
  WAREHOUSE = compute_wh
  SCHEDULE = 'USING CRON 0 * * * * America/Los_Angeles'  -- Every hour
AS
INSERT INTO freshness_alerts
SELECT
    CURRENT_TIMESTAMP() AS check_time,
    'ethereum.core.fact_blocks' AS table_name,
    MAX(block_timestamp) AS latest_block_time,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind,
    CASE
        WHEN DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) > 120
        THEN 'ALERT: Data is stale'
        ELSE 'OK'
    END AS status
FROM ethereum.core.fact_blocks;

-- Resume the task to start monitoring
ALTER TASK monitor_ethereum_freshness RESUME;
```

## When to contact support

Contact [data-shares@flipsidecrypto.com](mailto:data-shares@flipsidecrypto.com) if:

<AccordionGroup>
  <Accordion title="Data is significantly stale" icon="clock">
    * Core tables are greater than 6 hours behind chain head
    * DeFi/NFT tables are greater than 24 hours behind
    * Data hasn't updated in 48+ hours
  </Accordion>

  <Accordion title="Data quality issues" icon="triangle-exclamation">
    * Large gaps in block sequences - Missing transactions or events - Inconsistent transaction counts
      between tables
  </Accordion>

  <Accordion title="Unexpected changes" icon="shuffle">
    * Previously available data is now missing
    * Sudden changes in data structure or schema
    * Tables or schemas disappeared from your share
  </Accordion>
</AccordionGroup>

## Best practices for Snowflake

<CardGroup cols={2}>
  <Card title="Build time buffers" icon="shield">
    Don't query the last 10-15 minutes of data to account for processing latency and potential chain reorgs.
  </Card>

  <Card title="Use clustering" icon="bolt">
    Always filter by `block_timestamp` first to leverage Flipside's table clustering for maximum
    performance.
  </Card>

  <Card title="Monitor incrementally" icon="gauge">
    Set up Snowflake tasks to automatically check data freshness for your critical pipelines.
  </Card>

  <Card title="Handle late data" icon="clock-rotate-left">
    Use `block_timestamp` buffers in ETL pipelines to catch late-arriving data from chain reorgs or reprocessing.
  </Card>
</CardGroup>

## Next steps

<CardGroup cols={2}>
  <Card title="General freshness guide" icon="clock" href="/blockchain-data/data-freshness">
    See latency targets by schema and blockchain
  </Card>

  <Card title="Query optimization" icon="gauge-high" href="/snowflake-data-shares/query-optimization">
    Learn how clustering affects performance
  </Card>

  <Card title="Use cases" icon="lightbulb" href="/snowflake-data-shares/use-cases">
    See example queries with proper time filtering
  </Card>

  <Card title="Troubleshooting" icon="wrench" href="/snowflake-data-shares/troubleshooting">
    Fix common data freshness issues
  </Card>
</CardGroup>
