Discovery Overview

Tripl-i Discovery automatically finds and maps your entire IT infrastructure, creating a real-time, accurate view of all hardware, software, and their relationships. Using multiple discovery methods and AI-powered analysis, it eliminates manual inventory processes and ensures your CMDB stays current.

What is Discovery?

Discovery is the automated process of:

🔍 Finding devices and applications on your network
📊 Collecting detailed configuration and state information
🔗 Mapping relationships and dependencies
🧠 Analyzing patterns and anomalies with AI
🔄 Updating the CMDB with current data

Discovery Architecture

Discovery Methods

🔌 Agent-Based Discovery

Advantages:

Deep system information
Real-time updates
Access to systems behind firewalls
Minimal network impact

Supported Platforms:

Windows (PowerShell agent)
Linux (Python agent)
Unix (Shell agent)
Container environments

📡 Agentless Discovery

Network Scanning:

SNMP v1/v2c/v3
WMI (Windows)
SSH (Linux/Unix)
ICMP ping sweep

API-Based:

VMware vSphere
AWS EC2
Azure Resource Manager
Google Cloud Platform
Kubernetes API

🔄 Hybrid Discovery

Combines agent and agentless methods for:

Complete coverage
Minimal blind spots
Optimized performance
Flexible deployment

Discovery Process

1. Initial Discovery

Phase 1 - Network Sweep:
  - IP range scanning
  - Port identification
  - Basic device classification
  - Initial inventory

Phase 2 - Deep Discovery:
  - Credential-based access
  - Detailed configuration
  - Software inventory
  - Process analysis

Phase 3 - Relationship Mapping:
  - Network connections
  - Application dependencies
  - Service mapping
  - Data flow analysis

2. Continuous Discovery

AI-Powered Discovery Features

Intelligent Device Classification

{
  "discovered_device": {
    "ip": "10.1.1.50",
    "open_ports": [22, 80, 443, 3306],
    "banner_info": "Apache/2.4.41 (Ubuntu)"
  },
  "ai_classification": {
    "type": "Web Server",
    "os": "Ubuntu Linux",
    "role": "Application Server",
    "services": ["Apache Web Server", "MySQL Database"],
    "confidence": 94
  }
}
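
To make the mapping from raw scan data to a classification more concrete, here is a minimal rule-based sketch. It is a stand-in for illustration only, not the AI model: `classifyDevice` and the `PORT_SERVICES` table are hypothetical names, and a plain lookup table cannot produce the confidence score shown above.

```javascript
// Hypothetical rule-based stand-in for the AI classifier (illustration only).
const PORT_SERVICES = {
  22: 'SSH',
  80: 'HTTP',
  443: 'HTTPS',
  3306: 'MySQL Database',
};

function classifyDevice(device) {
  // Translate known open ports into service names; unknown ports are ignored.
  const services = device.open_ports
    .filter((port) => port in PORT_SERVICES)
    .map((port) => PORT_SERVICES[port]);

  // Read the OS hint from the parenthesised part of the banner, if present.
  const osMatch = /\(([^)]+)\)/.exec(device.banner_info || '');

  const servesWeb =
    device.open_ports.includes(80) || device.open_ports.includes(443);

  return {
    type: servesWeb ? 'Web Server' : 'Unknown',
    os: osMatch ? osMatch[1] : 'Unknown',
    services,
  };
}

console.log(
  classifyDevice({
    ip: '10.1.1.50',
    open_ports: [22, 80, 443, 3306],
    banner_info: 'Apache/2.4.41 (Ubuntu)',
  })
);
```

A real classifier would weigh many more signals (process lists, installed packages, traffic patterns); the sketch only shows the shape of the input and output.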

Pattern Recognition

Identifies standard deployment patterns
Detects application stacks
Recognizes clustering configurations
Maps load-balanced services

Anomaly Detection

Anomaly Detected:
  Type: New Device
  IP: 10.1.5.200
  Classification: Unknown Web Server
  Risk: Medium
  Recommendation: 
    - Verify if authorized
    - Check security compliance
    - Update firewall rules

Discovery Credentials

Credential Management

Credential Vault:
  Windows Domain:
    Type: Active Directory
    Scope: *.corp.local
    Privileges: Read-only
    
  Linux Systems:
    Type: SSH Key
    Scope: Production Subnet
    Sudo: NOPASSWD for discovery
    
  Network Devices:
    Type: SNMP v3
    Scope: Network Infrastructure
    Security: authPriv

Security Best Practices

✅ Use read-only credentials
✅ Implement credential rotation
✅ Limit scope by IP range
✅ Monitor credential usage
✅ Encrypt credentials at rest
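
One way to apply the "limit scope by IP range" practice is to check each target against a CIDR block before a credential is used. The sketch below assumes IPv4 only; the helper names `ipToInt` and `isInScope` are illustrative and not part of the product.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip
    .split('.')
    .reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Return true when `ip` falls inside the CIDR block, e.g. "10.1.0.0/16".
function isInScope(ip, cidr) {
  const [base, bits] = cidr.split('/');
  const prefix = Number(bits);
  // A /0 prefix matches everything; a 32-bit shift is not valid in JavaScript.
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
}

console.log(isInScope('10.1.5.200', '10.1.0.0/16'));    // inside the block
console.log(isInScope('192.168.1.100', '10.1.0.0/16')); // outside the block
```

Running this check before every credentialed connection keeps a credential from being offered to hosts outside its intended range.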

Discovery Scheduling

Schedule Types

Full Discovery

Complete infrastructure scan
All attributes collected
Relationship rebuild
Schedule: Weekly

Incremental Discovery

Changes only
New/modified devices
Quick updates
Schedule: Every 4 hours

Real-time Discovery

Agent-based changes
Immediate updates
Critical systems only
Schedule: Continuous

Smart Scheduling

{
  "smart_schedule": {
    "production_servers": {
      "method": "agent",
      "frequency": "real-time"
    },
    "development_servers": {
      "method": "agentless",
      "frequency": "daily"
    },
    "network_devices": {
      "method": "snmp",
      "frequency": "every_4_hours"
    },
    "workstations": {
      "method": "agent",
      "frequency": "on_login"
    }
  }
}
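
As a sketch of how such a configuration could drive a scheduler, the function below turns a frequency string into the next run time. The `FREQUENCY_INTERVALS` table and `nextRun` are assumptions made for illustration; event-driven frequencies (`real-time`, `on_login`) have no polling interval and are reported as `null`.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical mapping from frequency strings to polling intervals.
// Event-driven values are triggered by the agent, so they map to null.
const FREQUENCY_INTERVALS = {
  'real-time': null,
  on_login: null,
  every_4_hours: 4 * HOUR_MS,
  daily: 24 * HOUR_MS,
};

function nextRun(schedule, deviceClass, lastRun) {
  const entry = schedule[deviceClass];
  if (!entry) {
    throw new Error(`No schedule for ${deviceClass}`);
  }
  if (!(entry.frequency in FREQUENCY_INTERVALS)) {
    throw new Error(`Unknown frequency: ${entry.frequency}`);
  }
  const interval = FREQUENCY_INTERVALS[entry.frequency];
  // null means "no timer": the scan is triggered by an event instead.
  return interval === null ? null : new Date(lastRun.getTime() + interval);
}

const smartSchedule = {
  production_servers: { method: 'agent', frequency: 'real-time' },
  network_devices: { method: 'snmp', frequency: 'every_4_hours' },
};

console.log(nextRun(smartSchedule, 'network_devices', new Date()));
console.log(nextRun(smartSchedule, 'production_servers', new Date()));
```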

CI Matching and Reconciliation

Overview

The CI matching and reconciliation mechanism prevents duplicate Configuration Items (CIs) during discovery. Before creating a new CI, it runs a hierarchical lookup for an existing one, so that repeated scans of the same device update a single record.

Matching Hierarchy

The system uses a priority-based matching approach to identify existing CIs:

Priority Order:
1. Serial Number + MAC Address  # Highest confidence - unique hardware
2. Serial Number Only           # High confidence - usually unique
3. MAC Address Only             # Medium confidence - can change
4. Hostname                     # Low confidence - can be duplicated
5. IP Address                   # Last resort - dynamic/reusable
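
The priority order above can be sketched as a function that builds candidate lookups from strongest to weakest. This is an illustration under assumptions: `buildLookupQueries` is a hypothetical helper, and the `customFields.hostname` and `customFields.ip_address` field names are guesses modelled on the `serial_number` and `mac_address` fields used elsewhere on this page.

```javascript
// Build candidate lookups in priority order. Every query carries the tenant.
// NOTE: hostname and ip_address field names are assumed, not confirmed.
function buildLookupQueries(scan) {
  const { serialNumber, macAddress, hostname, ipAddress } = scan;
  const base = { type: scan.deviceType, tenant: scan.tenant };
  const queries = [];

  if (serialNumber && macAddress) {
    queries.push({
      level: 'serial+mac',
      query: {
        ...base,
        'customFields.serial_number': serialNumber,
        'customFields.mac_address': macAddress,
      },
    });
  }
  if (serialNumber) {
    queries.push({
      level: 'serial',
      query: { ...base, 'customFields.serial_number': serialNumber },
    });
  }
  if (macAddress) {
    queries.push({
      level: 'mac',
      query: { ...base, 'customFields.mac_address': macAddress },
    });
  }
  if (hostname) {
    queries.push({
      level: 'hostname',
      query: { ...base, 'customFields.hostname': hostname },
    });
  }
  // IP is the last resort and is skipped when hardware identifiers exist.
  if (ipAddress && !serialNumber && !macAddress) {
    queries.push({
      level: 'ip',
      query: { ...base, 'customFields.ip_address': ipAddress },
    });
  }
  return queries;
}

const candidates = buildLookupQueries({
  tenant: 'acme',
  deviceType: 'server',
  serialNumber: 'ABC123',
  macAddress: '00:11:22:33:44:55',
  ipAddress: '192.168.1.100',
});
console.log(candidates.map((c) => c.level));
```

A caller would run the queries in order and stop at the first one that returns a CI, creating a new CI only when all of them come back empty.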

Tenant Isolation

Critical Requirement: All CI lookups MUST include tenant filtering to ensure proper data isolation:

// Correct: Includes tenant in lookup
const existingCI = await CI.findOne({
  'customFields.serial_number': serialNumber,
  'type': deviceType,
  'tenant': scanData.tenant  // REQUIRED for isolation
});

// Incorrect: Missing tenant filter (causes duplicates)
// (named differently so both examples can sit in the same scope)
const unscopedCI = await CI.findOne({
  'customFields.serial_number': serialNumber,
  'type': deviceType
  // Missing tenant filter - will create duplicates!
});

Matching Algorithm

Implementation Details

1. Hardware Identifier Priority

// Hardware identifiers take precedence
const hasHardwareIdentifiers = serialNumber || macAddress;

if (hasHardwareIdentifiers) {
  // Skip IP-based matching when hardware IDs exist
  // This prevents overwriting different physical devices
  // that happen to share an IP address
}

2. Conflict Detection

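// Detect when the same IP now belongs to different hardware
if (existingCI) {
  const existingSerial = existingCI.customFields?.serial_number;
  const existingMAC = existingCI.customFields?.mac_address;

  // Compare an identifier only when both the scan and the stored CI have it
  const serialMatches = Boolean(serialNumber && serialNumber === existingSerial);
  const serialDiffers = Boolean(serialNumber && existingSerial && !serialMatches);
  const macDiffers = Boolean(macAddress && existingMAC && macAddress !== existingMAC);

  // A matching serial outweighs a changed MAC (for example, a replaced NIC)
  if (serialDiffers || (macDiffers && !serialMatches)) {
    // IP conflict detected - different physical device
    // Create a new CI instead of updating the existing one
    logger.warn('IP conflict: Creating new CI for different device');
  }
}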

3. Virtual Machine Handling

Virtual machines require special handling because:

Cloned VMs may share serial numbers
MAC addresses can be regenerated
VMware serial format: VMware-42 XX XX XX...

// VM Detection
const isVirtualMachine = serialNumber?.startsWith('VMware-') || 
                        serialNumber?.includes('Virtual');

if (isVirtualMachine) {
  // Use MAC address as primary identifier
  // Consider VM UUID if available
}
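
The VM branch above is left empty, so here is one possible way to fill it in. The sketch assumes a hypothetical `vmUuid` field on the scan data and chooses to prefer it over the MAC address when present; both the field name and that preference are assumptions, not documented behavior.

```javascript
function isVirtualMachine(serialNumber) {
  return Boolean(
    serialNumber?.startsWith('VMware-') || serialNumber?.includes('Virtual')
  );
}

// Pick the identifier to match on.
// ASSUMPTION: `vmUuid` is a hypothetical scan field, preferred for VMs.
function pickPrimaryIdentifier(scan) {
  if (isVirtualMachine(scan.serialNumber)) {
    if (scan.vmUuid) {
      return { field: 'vm_uuid', value: scan.vmUuid };
    }
    if (scan.macAddress) {
      return { field: 'mac_address', value: scan.macAddress };
    }
    // A VM with neither UUID nor MAC falls through to its serial number,
    // which may be shared with clones.
  }
  if (scan.serialNumber) {
    return { field: 'serial_number', value: scan.serialNumber };
  }
  if (scan.macAddress) {
    return { field: 'mac_address', value: scan.macAddress };
  }
  return null; // No hardware identifier available
}

console.log(
  pickPrimaryIdentifier({
    serialNumber: 'VMware-42 1a 2b 3c',
    macAddress: '00:50:56:aa:bb:cc',
  })
);
```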

Common Matching Scenarios

Scenario 1: Hardware Refresh

Old Device:
  Serial: ABC123
  MAC: 00:11:22:33:44:55
  IP: 192.168.1.100

New Device:
  Serial: XYZ789  # Different
  MAC: AA:BB:CC:DD:EE:FF  # Different
  IP: 192.168.1.100  # Same (reused)

Result: Creates new CI (different hardware)

Scenario 2: Network Change

Before:
  Serial: ABC123
  MAC: 00:11:22:33:44:55
  IP: 192.168.1.100

After:
  Serial: ABC123  # Same
  MAC: 00:11:22:33:44:55  # Same
  IP: 10.0.0.50  # Different (moved)

Result: Updates existing CI (same hardware)

Scenario 3: MAC Address Change

Before:
  Serial: ABC123
  MAC: 00:11:22:33:44:55
  
After:
  Serial: ABC123  # Same
  MAC: AA:BB:CC:DD:EE:FF  # Different (NIC replaced)

Result: Updates existing CI (serial match)
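
The three scenarios can be reproduced with a small in-memory decision function. This is a simplified sketch rather than the production matcher: `decideAction` is a hypothetical name, the CI objects use flat `serial`, `mac`, and `ip` properties for brevity, and all CIs are assumed to belong to the same tenant.

```javascript
// In-memory sketch of the create-or-update decision (no database involved).
function decideAction(existingCIs, scan) {
  // The serial number is the strongest single identifier.
  if (scan.serial && existingCIs.some((ci) => ci.serial === scan.serial)) {
    return { action: 'update', reason: 'serial match' };
  }
  if (scan.mac && existingCIs.some((ci) => ci.mac === scan.mac)) {
    return { action: 'update', reason: 'MAC match' };
  }
  // Hardware identifiers exist but match nothing: never fall back to IP.
  if (scan.serial || scan.mac) {
    return { action: 'create', reason: 'different hardware' };
  }
  if (scan.ip && existingCIs.some((ci) => ci.ip === scan.ip)) {
    return { action: 'update', reason: 'IP match' };
  }
  return { action: 'create', reason: 'no match' };
}

const inventory = [
  { serial: 'ABC123', mac: '00:11:22:33:44:55', ip: '192.168.1.100' },
];

// Scenario 1: hardware refresh (new serial and MAC, reused IP)
console.log(decideAction(inventory, { serial: 'XYZ789', mac: 'AA:BB:CC:DD:EE:FF', ip: '192.168.1.100' }));
// Scenario 2: network change (same hardware, new IP)
console.log(decideAction(inventory, { serial: 'ABC123', mac: '00:11:22:33:44:55', ip: '10.0.0.50' }));
// Scenario 3: MAC address change (same serial, new NIC)
console.log(decideAction(inventory, { serial: 'ABC123', mac: 'AA:BB:CC:DD:EE:FF' }));
```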

Duplicate Prevention

Root Cause of Duplicates

Duplicates typically have one of these causes:

Missing Tenant Filter: Lookups don't include tenant field
Timing Issues: Concurrent scans of same device
Identifier Changes: Hardware changes between scans
Data Quality: Missing or invalid identifiers

Prevention Strategies

const mongoose = require('mongoose');

// 1. Always include tenant in lookups
const lookupQuery = {
  $and: [
    { 'customFields.serial_number': serialNumber },
    { 'type': deviceType },
    { 'tenant': scanData.tenant }  // Critical!
  ]
};

// 2. Use transactions for atomic operations
const session = await mongoose.startSession();
try {
  await session.withTransaction(async () => {
    const existingCI = await CI.findOne(lookupQuery).session(session);
    if (!existingCI) {
      await CI.create([ciData], { session });
    }
  });
} finally {
  await session.endSession();  // Release the session even if the transaction fails
}

// 3. Implement retry logic for conflicts
// Error code 11000 is raised only when a unique index covers the lookup fields
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const maxRetries = 3;
for (let i = 0; i < maxRetries; i++) {
  try {
    await processCI(scanData);
    break;
  } catch (error) {
    if (error.code === 11000 && i < maxRetries - 1) {
      // Duplicate key error - back off exponentially, then retry
      await sleep(100 * Math.pow(2, i));
    } else {
      throw error;
    }
  }
}

Troubleshooting Duplicates

Finding Duplicates

// Script to identify duplicate CIs
const duplicates = await CI.aggregate([
  {
    $match: {
      'customFields.serial_number': { $exists: true, $ne: '' }
    }
  },
  {
    $group: {
      _id: {
        serial: '$customFields.serial_number',
        tenant: '$tenant'
      },
      count: { $sum: 1 },
      ids: { $push: '$_id' }
    }
  },
  {
    $match: { count: { $gt: 1 } }
  }
]);

Resolution Steps

1. Identify duplicates using the script above
2. Compare last_scan timestamps to find the most recent CI
3. Merge custom fields and relationships
4. Update references in related documents
5. Delete older duplicate CIs
6. Verify no orphaned relationships remain
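
The timestamp comparison and field merge in the steps above could look roughly like the sketch below. It assumes `last_scan` is an ISO timestamp stored at the top level of each CI document, which this page does not confirm, and `mergeDuplicates` is a hypothetical helper. Relationships and references are left out and would need their own handling.

```javascript
// Merge a group of duplicate CIs into the most recently scanned one.
// ASSUMPTION: each CI has a top-level `last_scan` ISO timestamp.
function mergeDuplicates(duplicates) {
  const sorted = [...duplicates].sort(
    (a, b) => new Date(b.last_scan) - new Date(a.last_scan)
  );
  const [survivor, ...older] = sorted;

  // Apply fields from oldest to newest so newer values win,
  // but never let an empty value erase older data.
  const customFields = {};
  for (const ci of [...sorted].reverse()) {
    for (const [key, value] of Object.entries(ci.customFields || {})) {
      if (value !== undefined && value !== null && value !== '') {
        customFields[key] = value;
      }
    }
  }

  return {
    survivor: { ...survivor, customFields },
    idsToDelete: older.map((ci) => ci._id),
  };
}

const merged = mergeDuplicates([
  {
    _id: 'a',
    last_scan: '2024-01-01T00:00:00Z',
    customFields: { serial_number: 'ABC123', location: 'DC1' },
  },
  {
    _id: 'b',
    last_scan: '2024-02-01T00:00:00Z',
    customFields: { serial_number: 'ABC123', mac_address: '00:11:22:33:44:55', location: '' },
  },
]);
console.log(merged);
```

Returning the IDs to delete instead of deleting them directly leaves room to update references in related documents first.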

Best Practices

Always Include Tenant: Every CI lookup must filter by tenant
Use Hardware IDs: Prioritize serial/MAC over IP address
Handle Conflicts: Detect and log identifier mismatches
Monitor Duplicates: Regular audits for duplicate detection
Test Thoroughly: Verify matching logic with edge cases

Discovery Data Processing

Data Flow Pipeline

Data Quality Controls

Validation Rules:

Required field checks
Format validation
Range verification
Consistency checks

Deduplication Logic:

Serial number matching
MAC address correlation
Hostname resolution
UUID comparison
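
The validation rules listed above could be applied to an incoming scan record along these lines. The record shape (`tenant`, `serialNumber`, `macAddress`, `hostname`, `ipAddress`) and the function name `validateScanRecord` are assumptions made for illustration.

```javascript
const MAC_PATTERN = /^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$/;
const IPV4_PATTERN = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;

// Return a list of validation errors; an empty list means the record is valid.
function validateScanRecord(record) {
  const errors = [];

  // Required field check: every CI lookup needs a tenant.
  if (!record.tenant) {
    errors.push('tenant is required');
  }

  // Format validation: colon-separated MAC address.
  if (record.macAddress && !MAC_PATTERN.test(record.macAddress)) {
    errors.push('macAddress has an invalid format');
  }

  // Range verification: each IPv4 octet must be 0-255.
  if (record.ipAddress) {
    const match = IPV4_PATTERN.exec(record.ipAddress);
    const inRange =
      match !== null && match.slice(1).every((octet) => Number(octet) <= 255);
    if (!inRange) {
      errors.push('ipAddress is not a valid IPv4 address');
    }
  }

  // Consistency check: at least one identifier must be present for matching.
  const hasIdentifier =
    record.serialNumber || record.macAddress || record.hostname || record.ipAddress;
  if (!hasIdentifier) {
    errors.push('no identifier present');
  }

  return errors;
}

console.log(validateScanRecord({ tenant: 'acme', macAddress: 'not-a-mac' }));
```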

Performance & Scalability

Discovery Metrics

Performance Targets:
  Devices per Hour: 10,000
  Concurrent Scans: 500
  Data Processing: 1M attributes/min
  CMDB Updates: 50,000/min
  
Current Performance:
  Network Utilization: 12%
  CPU Usage: 35%
  Memory Usage: 4.2 GB
  Queue Depth: 127 devices

Scaling Strategies

Horizontal Scaling:

Multiple discovery engines
Distributed processing
Load balancing
Regional collectors

Optimization Techniques:

Parallel scanning
Batch processing
Caching mechanisms
Smart scheduling
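
Parallel scanning with a cap on concurrent scans can be sketched with a small worker pool. This is a generic pattern shown under assumptions, not the discovery engine's scheduler; `runWithConcurrency` and the simulated scan tasks are hypothetical.

```javascript
// Run async tasks with at most `limit` of them in flight at once.
// Results are returned in the same order as the input tasks.
async function runWithConcurrency(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  async function worker() {
    while (next < tasks.length) {
      // Claiming an index is synchronous, so two workers never share a task.
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Simulated scans: each task resolves after a short delay.
const scanTasks = ['10.1.1.1', '10.1.1.2', '10.1.1.3'].map(
  (ip) => () =>
    new Promise((resolve) => setTimeout(() => resolve({ ip, ok: true }), 10))
);

runWithConcurrency(scanTasks, 2).then((results) => console.log(results));
```

Lowering `limit` trades scan duration for less load on the network and on target systems.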

Discovery Reporting

Discovery Dashboard

{
  "discovery_summary": {
    "total_devices": 15847,
    "discovered_today": 234,
    "failed_discoveries": 12,
    "success_rate": "98.7%",
    "coverage": {
      "servers": "100%",
      "workstations": "94%",
      "network": "100%",
      "cloud": "87%"
    }
  }
}

Key Reports

Discovery Coverage
- Discovered vs expected
- Blind spots analysis
- Credential failures
- Network unreachable

Discovery Performance
- Scan duration trends
- Success/failure rates
- Resource utilization
- Queue statistics

New Device Report
- Newly discovered items
- Unauthorized devices
- Shadow IT detection
- Compliance gaps
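
The "discovered vs expected" comparison behind the coverage report can be sketched as a set difference. The function name `coverageReport` and the use of plain host identifiers as input are assumptions for illustration.

```javascript
// Compare the hosts you expect to exist against the hosts discovery found.
function coverageReport(expected, discovered) {
  const discoveredSet = new Set(discovered);
  const expectedSet = new Set(expected);

  // Expected but never discovered: candidates for blind spot analysis.
  const blindSpots = expected.filter((host) => !discoveredSet.has(host));
  // Discovered but not expected: candidates for the new device report.
  const unexpected = discovered.filter((host) => !expectedSet.has(host));

  const found = expected.length - blindSpots.length;
  const percent = expected.length === 0 ? 100 : (found / expected.length) * 100;

  return {
    coveragePercent: Math.round(percent * 10) / 10,
    blindSpots,
    unexpected,
  };
}

console.log(
  coverageReport(
    ['web01', 'web02', 'db01', 'db02'],
    ['web01', 'web02', 'db01', 'test99']
  )
);
```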

Troubleshooting

Common Issues

Duplicate CIs Created:

Symptom: Multiple CIs for same device
Causes:
  - Missing tenant filter in lookups
  - Concurrent scan processing
  - Changed hardware identifiers
  - IP address reuse
  
Resolution:
  - Verify tenant filtering in processors
  - Run duplicate detection script
  - Merge duplicate CIs carefully
  - Update matching logic
  
Prevention:
  - Always include tenant in queries
  - Use hardware IDs for matching
  - Implement proper error handling
  - Monitor for duplicates regularly

Discovery Failures:

Symptom: Devices not discovered
Causes:
  - Network connectivity
  - Firewall blocking
  - Invalid credentials
  - Service disabled
  
Resolution:
  - Check port access
  - Verify credentials
  - Review firewall logs
  - Enable services

Performance Issues:

Symptom: Slow discovery
Causes:
  - Network congestion
  - Overloaded targets
  - Large scan ranges
  - Insufficient resources
  
Resolution:
  - Adjust scheduling
  - Limit concurrent scans
  - Increase resources
  - Optimize queries

CI Matching Failures:

Symptom: Existing CIs are not updated; new CIs are created instead
Causes:
  - Missing serial numbers
  - Changed MAC addresses
  - Dynamic IP assignment
  - Tenant mismatch
  
Resolution:
  - Verify hardware identifiers present
  - Check tenant assignment
  - Review matching hierarchy
  - Update lookup logic if needed

Best Practices

1. Planning

✅ Map network topology first
✅ Identify critical systems
✅ Plan discovery phases
✅ Set realistic schedules

2. Implementation

✅ Start with small pilot
✅ Validate discovered data
✅ Tune discovery patterns
✅ Monitor performance

3. Optimization

✅ Regular schedule review
✅ Credential maintenance
✅ Performance tuning
✅ Coverage analysis

4. Governance

✅ Discovery approval process
✅ Change notification
✅ Compliance validation
✅ Regular audits

Next Steps

📖 Network Scanning - Detailed network discovery
📖 Agent Deployment - Installing discovery agents
📖 Credential Management - Managing discovery credentials
📖 Discovery Patterns - Custom discovery patterns
