Discovery Overview

NopeSight Discovery automatically finds and maps your IT infrastructure, building a current view of hardware, software, and the relationships between them. It combines multiple discovery methods with AI-powered analysis to reduce manual inventory work and keep your CMDB up to date.

What is Discovery?

Discovery is the automated process of:

  • 🔍 Finding devices and applications on your network
  • 📊 Collecting detailed configuration and state information
  • 🔗 Mapping relationships and dependencies
  • 🧠 Analyzing patterns and anomalies with AI
  • 🔄 Updating the CMDB with current data

Discovery Architecture

Discovery Methods

🔌 Agent-Based Discovery

Advantages:

  • Deep system information
  • Real-time updates
  • Access to systems behind firewalls
  • Minimal network impact

Supported Platforms:

  • Windows (PowerShell agent)
  • Linux (Python agent)
  • Unix (Shell agent)
  • Container environments

📡 Agentless Discovery

Network Scanning:

  • SNMP v1/v2c/v3
  • WMI (Windows)
  • SSH (Linux/Unix)
  • ICMP ping sweep

API-Based:

  • VMware vSphere
  • AWS EC2
  • Azure Resource Manager
  • Google Cloud Platform
  • Kubernetes API

🔄 Hybrid Discovery

Combines agent and agentless methods for:

  • Complete coverage
  • Minimal blind spots
  • Optimized performance
  • Flexible deployment

Discovery Process

1. Initial Discovery

Phase 1 - Network Sweep:
- IP range scanning
- Port identification
- Basic device classification
- Initial inventory

Phase 2 - Deep Discovery:
- Credential-based access
- Detailed configuration
- Software inventory
- Process analysis

Phase 3 - Relationship Mapping:
- Network connections
- Application dependencies
- Service mapping
- Data flow analysis
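
The IP range scanning step in Phase 1 starts from a list of candidate addresses. A minimal sketch of expanding an IPv4 CIDR block into host addresses is shown here. The function names (ipToInt, intToIp, expandCidr) are hypothetical and not part of NopeSight, and the sketch assumes well-formed input. It materializes the full list, so it is only suitable for small ranges.

```javascript
// Hypothetical helpers: expand an IPv4 CIDR block into host addresses.
function ipToInt(ip) {
  return ip.split('.').reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

function intToIp(n) {
  return [24, 16, 8, 0].map((shift) => Math.floor(n / 2 ** shift) % 256).join('.');
}

function expandCidr(cidr) {
  const [base, prefixText] = cidr.split('/');
  const prefix = Number(prefixText);
  if (!Number.isInteger(prefix) || prefix < 0 || prefix > 32) {
    throw new Error(`Invalid prefix in ${cidr}`);
  }
  const size = 2 ** (32 - prefix);
  const network = Math.floor(ipToInt(base) / size) * size;
  // For /31 and /32 every address is usable; otherwise skip network and broadcast.
  const first = prefix >= 31 ? network : network + 1;
  const last = prefix >= 31 ? network + size - 1 : network + size - 2;
  const hosts = [];
  for (let n = first; n <= last; n++) hosts.push(intToIp(n));
  return hosts;
}
```

Each returned address would then be probed (for example with an ICMP ping or a port check) before moving on to Phase 2.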

2. Continuous Discovery

AI-Powered Discovery Features

Intelligent Device Classification

{
  "discovered_device": {
    "ip": "10.1.1.50",
    "open_ports": [22, 80, 443, 3306],
    "banner_info": "Apache/2.4.41 (Ubuntu)"
  },
  "ai_classification": {
    "type": "Web Server",
    "os": "Ubuntu Linux",
    "role": "Application Server",
    "services": ["Apache Web Server", "MySQL Database"],
    "confidence": 94
  }
}
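
To make the inputs in the example above concrete, the following sketch derives a rough service list and OS guess from open_ports and banner_info using a fixed lookup table. This is a rule-based stand-in for illustration only, not NopeSight's AI classifier. The table contents and the guessServices name are assumptions.

```javascript
// Hypothetical port-to-service table; not NopeSight's classification model.
const WELL_KNOWN_PORTS = { 22: 'SSH', 80: 'HTTP', 443: 'HTTPS', 3306: 'MySQL' };

function guessServices(device) {
  // Keep only ports that appear in the lookup table.
  const services = device.open_ports
    .filter((port) => port in WELL_KNOWN_PORTS)
    .map((port) => WELL_KNOWN_PORTS[port]);
  // A banner mentioning Ubuntu is the only OS hint this sketch understands.
  const banner = device.banner_info || '';
  const os = /ubuntu/i.test(banner) ? 'Ubuntu Linux' : 'Unknown';
  return { services, os };
}
```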

Pattern Recognition

  • Identifies standard deployment patterns
  • Detects application stacks
  • Recognizes clustering configurations
  • Maps load-balanced services

Anomaly Detection

Anomaly Detected:
  Type: New Device
  IP: 10.1.5.200
  Classification: Unknown Web Server
  Risk: Medium
  Recommendation:
    - Verify if authorized
    - Check security compliance
    - Update firewall rules

Discovery Credentials

Credential Management

Credential Vault:
  Windows Domain:
    Type: Active Directory
    Scope: "*.corp.local"
    Privileges: Read-only

  Linux Systems:
    Type: SSH Key
    Scope: Production Subnet
    Sudo: NOPASSWD for discovery

  Network Devices:
    Type: SNMP v3
    Scope: Network Infrastructure
    Security: authPriv

Security Best Practices

  • ✅ Use read-only credentials
  • ✅ Implement credential rotation
  • ✅ Limit scope by IP range
  • ✅ Monitor credential usage
  • ✅ Encrypt credentials at rest

Discovery Scheduling

Schedule Types

Full Discovery

  • Complete infrastructure scan
  • All attributes collected
  • Relationship rebuild
  • Schedule: Weekly

Incremental Discovery

  • Changes only
  • New/modified devices
  • Quick updates
  • Schedule: Every 4 hours

Real-time Discovery

  • Agent-based changes
  • Immediate updates
  • Critical systems only
  • Schedule: Continuous

Smart Scheduling

{
  "smart_schedule": {
    "production_servers": {
      "method": "agent",
      "frequency": "real-time"
    },
    "development_servers": {
      "method": "agentless",
      "frequency": "daily"
    },
    "network_devices": {
      "method": "snmp",
      "frequency": "every_4_hours"
    },
    "workstations": {
      "method": "agent",
      "frequency": "on_login"
    }
  }
}
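
A scheduler consuming a configuration like the one above needs to turn each frequency label into a next run time. The sketch below shows one way to do that. The interval table and the nextRun function are hypothetical, and it assumes that "real-time" and "on_login" are event-driven rather than polled.

```javascript
// Hypothetical mapping from the frequency labels above to polling intervals.
const HOUR_MS = 60 * 60 * 1000;
const INTERVALS_MS = {
  every_4_hours: 4 * HOUR_MS,
  daily: 24 * HOUR_MS,
  weekly: 7 * 24 * HOUR_MS,
};

// Returns the next run time as a Date, or null for event-driven frequencies
// such as "real-time" and "on_login", which are not polled.
function nextRun(frequency, lastRun) {
  if (frequency === 'real-time' || frequency === 'on_login') return null;
  const interval = INTERVALS_MS[frequency];
  if (interval === undefined) throw new Error(`Unknown frequency: ${frequency}`);
  return new Date(lastRun.getTime() + interval);
}
```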

CI Matching and Reconciliation

Overview

The CI matching and reconciliation mechanism prevents duplicate Configuration Items from being created during discovery. It uses a hierarchical lookup strategy to find an existing CI before creating a new one, which protects data integrity.

Matching Hierarchy

The system uses a priority-based matching approach to identify existing CIs:

Priority Order:
1. Serial Number + MAC Address # Highest confidence - unique hardware
2. Serial Number Only # High confidence - usually unique
3. MAC Address Only # Medium confidence - can change
4. Hostname # Low confidence - can be duplicated
5. IP Address # Last resort - dynamic/reusable
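
The priority order above can be expressed as an ordered list of lookup filters that are tried until one matches. In this sketch the serial_number and mac_address field names come from this document, while the hostname and ip_address field names, the scan object shape, and the buildLookupQueries name are assumptions. It also applies the rule from this section that IP-based matching is skipped when a hardware identifier is present.

```javascript
// Hypothetical sketch: build lookup filters in the priority order listed above.
function buildLookupQueries(scan) {
  const { serialNumber, macAddress, hostname, ipAddress, tenant } = scan;
  if (!tenant) throw new Error('tenant is required for every CI lookup');
  const hasHardwareId = Boolean(serialNumber || macAddress);
  const candidates = [
    serialNumber && macAddress && {
      'customFields.serial_number': serialNumber,
      'customFields.mac_address': macAddress,
    },
    serialNumber && { 'customFields.serial_number': serialNumber },
    macAddress && { 'customFields.mac_address': macAddress },
    hostname && { 'customFields.hostname': hostname },
    // IP is the last resort and is skipped when hardware IDs exist.
    !hasHardwareId && ipAddress && { 'customFields.ip_address': ipAddress },
  ];
  // Drop levels with missing identifiers and scope every filter to the tenant.
  return candidates.filter(Boolean).map((filter) => ({ ...filter, tenant }));
}
```

A caller would pass each filter to the lookup in turn and stop at the first match.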

Tenant Isolation

Critical Requirement: All CI lookups MUST include tenant filtering to ensure proper data isolation:

// Correct: Includes tenant in lookup
const existingCI = await CI.findOne({
  'customFields.serial_number': serialNumber,
  'type': deviceType,
  'tenant': scanData.tenant // REQUIRED for isolation
});

// Incorrect: Missing tenant filter (causes duplicates)
const existingCI = await CI.findOne({
  'customFields.serial_number': serialNumber,
  'type': deviceType
  // Missing tenant filter - will create duplicates!
});
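
One way to make the tenant requirement hard to forget is to route every lookup filter through a small guard that refuses to proceed without a tenant. This is a sketch of that idea, not an existing NopeSight helper. The tenantScoped name is hypothetical.

```javascript
// Hypothetical guard: refuse to build a CI lookup filter without a tenant.
function tenantScoped(filter, tenant) {
  if (tenant === undefined || tenant === null || tenant === '') {
    throw new Error('CI lookup rejected: tenant is missing');
  }
  // Return a new object so the caller's filter is not mutated.
  return { ...filter, tenant };
}
```

A lookup would then read CI.findOne(tenantScoped({ 'customFields.serial_number': serialNumber, type: deviceType }, scanData.tenant)), so an unscoped query fails loudly instead of silently matching across tenants.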

Matching Algorithm

Implementation Details

1. Hardware Identifier Priority

// Hardware identifiers take precedence
const hasHardwareIdentifiers = serialNumber || macAddress;

if (hasHardwareIdentifiers) {
  // Skip IP-based matching when hardware IDs exist
  // This prevents overwriting different physical devices
  // that happen to share an IP address
}

2. Conflict Detection

// Detect when same IP has different hardware
if (existingCI) {
  const existingSerial = existingCI.customFields.serial_number;
  const existingMAC = existingCI.customFields.mac_address;

  if (serialNumber !== existingSerial || macAddress !== existingMAC) {
    // IP conflict detected - different physical device
    // Create new CI instead of updating
    logger.warn('IP conflict: Creating new CI for different device');
  }
}
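
The comparison above reports a conflict whenever either identifier differs, which includes the case where a scan simply did not return a serial number or MAC address. A more tolerant variant is sketched below, under two assumptions that are not stated in this document: a missing identifier is treated as "unknown" rather than "different", and a matching serial number outweighs a changed MAC address (consistent with the NIC replacement scenario later in this section). The function names are hypothetical.

```javascript
// Hypothetical sketch: two values differ only when both are present and
// they are not equal (case-insensitive).
function differs(a, b) {
  return Boolean(a) && Boolean(b) && a.toLowerCase() !== b.toLowerCase();
}

function hasHardwareConflict(existingCI, scan) {
  const fields = existingCI.customFields || {};
  // When both serial numbers are known, the serial number decides.
  if (fields.serial_number && scan.serialNumber) {
    return differs(fields.serial_number, scan.serialNumber);
  }
  // Otherwise fall back to the MAC address; unknown values are not conflicts.
  return differs(fields.mac_address, scan.macAddress);
}
```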

3. Virtual Machine Handling

Virtual machines require special handling because:

  • Cloned VMs may share serial numbers
  • MAC addresses can be regenerated
  • VMware serial numbers follow the format VMware-42 XX XX XX...

// VM Detection
const isVirtualMachine = serialNumber?.startsWith('VMware-') ||
  serialNumber?.includes('Virtual');

if (isVirtualMachine) {
  // Use MAC address as primary identifier
  // Consider VM UUID if available
}
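
The empty branch above can be filled in along the following lines. This sketch picks the identifier to match on for a given scan. It assumes a hypothetical vmUuid field on the scan data and, as a design choice of the sketch rather than documented behavior, prefers that UUID over the MAC address when both are present, because VM MAC addresses can be regenerated.

```javascript
// Hypothetical sketch: choose the identifier to match on for a scan.
function isVirtualMachine(serialNumber) {
  return Boolean(serialNumber) &&
    (serialNumber.startsWith('VMware-') || serialNumber.includes('Virtual'));
}

function primaryIdentifier(scan) {
  if (isVirtualMachine(scan.serialNumber)) {
    // Cloned VMs may share a serial number, so it is never used for VMs.
    if (scan.vmUuid) return { kind: 'vm_uuid', value: scan.vmUuid };
    if (scan.macAddress) return { kind: 'mac_address', value: scan.macAddress };
    return null;
  }
  if (scan.serialNumber) return { kind: 'serial_number', value: scan.serialNumber };
  if (scan.macAddress) return { kind: 'mac_address', value: scan.macAddress };
  return null;
}
```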

Common Matching Scenarios

Scenario 1: Hardware Refresh

Old Device:
Serial: ABC123
MAC: 00:11:22:33:44:55
IP: 192.168.1.100

New Device:
Serial: XYZ789 # Different
MAC: AA:BB:CC:DD:EE:FF # Different
IP: 192.168.1.100 # Same (reused)

Result: Creates new CI (different hardware)

Scenario 2: Network Change

Before:
Serial: ABC123
MAC: 00:11:22:33:44:55
IP: 192.168.1.100

After:
Serial: ABC123 # Same
MAC: 00:11:22:33:44:55 # Same
IP: 10.0.0.50 # Different (moved)

Result: Updates existing CI (same hardware)

Scenario 3: MAC Address Change

Before:
Serial: ABC123
MAC: 00:11:22:33:44:55

After:
Serial: ABC123 # Same
MAC: AA:BB:CC:DD:EE:FF # Different (NIC replaced)

Result: Updates existing CI (serial match)
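
The three scenarios can be replayed against a small in-memory matcher to check that the expected outcomes follow from matching on serial number first and MAC address second. This is an illustrative sketch with a hypothetical record shape (serial, mac, ip). It covers a single tenant and leaves out the database and the hostname and IP fallback levels.

```javascript
// Hypothetical in-memory matcher used to replay the three scenarios above.
function matchOrCreate(inventory, scan) {
  const bySerial = inventory.find((ci) => scan.serial && ci.serial === scan.serial);
  const byMac = inventory.find((ci) => scan.mac && ci.mac === scan.mac);
  const existing = bySerial || byMac; // serial number takes priority over MAC
  if (existing) {
    Object.assign(existing, scan); // same hardware: refresh MAC/IP in place
    return { action: 'updated', ci: existing };
  }
  const created = { ...scan };
  inventory.push(created); // unknown hardware: new CI even if the IP is reused
  return { action: 'created', ci: created };
}
```

Replaying Scenario 1 against an inventory holding the old device yields action "created", while Scenarios 2 and 3 yield "updated".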

Duplicate Prevention

Root Cause of Duplicates

Duplicates occur when:

  1. Missing Tenant Filter: Lookups don't include tenant field
  2. Timing Issues: Concurrent scans of same device
  3. Identifier Changes: Hardware changes between scans
  4. Data Quality: Missing or invalid identifiers

Prevention Strategies

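// 1. Always include tenant in lookups
const lookupQuery = {
  $and: [
    { 'customFields.serial_number': serialNumber },
    { 'type': deviceType },
    { 'tenant': scanData.tenant } // Critical!
  ]
};

// 2. Use transactions for atomic operations
// Note: a transaction alone does not stop two concurrent scans from both
// inserting the same device. Pair it with a unique index (such as one on
// tenant + serial number) so the second insert fails with a duplicate key error.
const session = await mongoose.startSession();
try {
  await session.withTransaction(async () => {
    const existingCI = await CI.findOne(lookupQuery).session(session);
    if (!existingCI) {
      await CI.create([ciData], { session });
    }
  });
} finally {
  await session.endSession(); // release the session even if the transaction fails
}

// 3. Implement retry logic for conflicts
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const maxRetries = 3;
for (let i = 0; i < maxRetries; i++) {
  try {
    await processCI(scanData);
    break;
  } catch (error) {
    if (error.code === 11000 && i < maxRetries - 1) {
      // Duplicate key error - retry with exponential backoff (100, 200 ms)
      await sleep(100 * Math.pow(2, i));
    } else {
      throw error;
    }
  }
}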

Troubleshooting Duplicates

Finding Duplicates

// Script to identify duplicate CIs
const duplicates = await CI.aggregate([
{
$match: {
'customFields.serial_number': { $exists: true, $ne: '' }
}
},
{
$group: {
_id: {
serial: '$customFields.serial_number',
tenant: '$tenant'
},
count: { $sum: 1 },
ids: { $push: '$_id' }
}
},
{
$match: { count: { $gt: 1 } }
}
]);

Resolution Steps

  1. Identify duplicates using the script above
  2. Compare last_scan timestamps to find most recent
  3. Merge custom fields and relationships
  4. Update references in related documents
  5. Delete older duplicate CIs
  6. Verify no orphaned relationships remain
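
Steps 2 and 3 can be sketched as a pure function that picks the most recently scanned CI as the survivor and fills in any custom fields it lacks from the older duplicates. The mergeDuplicates name is hypothetical, and the sketch assumes last_scan is a top-level, date-parseable field on each CI. Relationship merging and reference updates (steps 3 and 4) are not covered.

```javascript
// Hypothetical sketch of steps 2 and 3: keep the most recently scanned CI and
// fill in any custom fields it lacks from the older duplicates.
function mergeDuplicates(duplicates) {
  if (duplicates.length === 0) throw new Error('No CIs to merge');
  // Newest scan first; the input array is left untouched.
  const sorted = [...duplicates].sort(
    (a, b) => new Date(b.last_scan) - new Date(a.last_scan)
  );
  const [survivor, ...older] = sorted;
  const customFields = { ...survivor.customFields };
  for (const ci of older) {
    for (const [key, value] of Object.entries(ci.customFields || {})) {
      // Only fill gaps; values on the survivor always win.
      if (customFields[key] === undefined || customFields[key] === '') {
        customFields[key] = value;
      }
    }
  }
  return {
    merged: { ...survivor, customFields },
    idsToDelete: older.map((ci) => ci._id),
  };
}
```

Returning the IDs to delete separately keeps the destructive step explicit, so it can run only after references have been updated.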

Best Practices

  1. Always Include Tenant: Every CI lookup must filter by tenant
  2. Use Hardware IDs: Prioritize serial/MAC over IP address
  3. Handle Conflicts: Detect and log identifier mismatches
  4. Monitor Duplicates: Regular audits for duplicate detection
  5. Test Thoroughly: Verify matching logic with edge cases

Discovery Data Processing

Data Flow Pipeline

Data Quality Controls

Validation Rules:

  • Required field checks
  • Format validation
  • Range verification
  • Consistency checks

Deduplication Logic:

  • Serial number matching
  • MAC address correlation
  • Hostname resolution
  • UUID comparison
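
The validation rules listed above can be illustrated with a small validator that returns a list of problems for a scan record. The specific rules, the field names (tenant, ipAddress, macAddress), and the validateScan name are assumptions chosen for the example, not NopeSight's actual rule set.

```javascript
// Hypothetical validator covering required-field, format, and range checks.
const MAC_PATTERN = /^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$/;

function isValidIpv4(ip) {
  const parts = String(ip).split('.');
  // Format check (four numeric parts) plus range check (each 0-255).
  return parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
}

function validateScan(scan) {
  const errors = [];
  if (!scan.tenant) errors.push('tenant is required');
  if (!scan.ipAddress) errors.push('ipAddress is required');
  else if (!isValidIpv4(scan.ipAddress)) errors.push('ipAddress is not valid IPv4');
  if (scan.macAddress && !MAC_PATTERN.test(scan.macAddress)) {
    errors.push('macAddress is not in XX:XX:XX:XX:XX:XX format');
  }
  return errors; // an empty array means the record passed
}
```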

Performance & Scalability

Discovery Metrics

Performance Targets:
Devices per Hour: 10,000
Concurrent Scans: 500
Data Processing: 1M attributes/min
CMDB Updates: 50,000/min

Current Performance:
Network Utilization: 12%
CPU Usage: 35%
Memory Usage: 4.2 GB
Queue Depth: 127 devices
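
As a quick sanity check on these figures, the time needed to clear the current queue at the target throughput can be estimated by dividing queue depth by devices per hour. The helper below is a hypothetical illustration. It assumes the 10,000 devices/hour target is actually sustained, which the figures above do not confirm.

```javascript
// Hypothetical helper: time to clear a backlog at a given throughput.
function drainTimeSeconds(queueDepth, devicesPerHour) {
  if (devicesPerHour <= 0) throw new Error('devicesPerHour must be positive');
  return (queueDepth / devicesPerHour) * 3600;
}

// Queue depth of 127 at the 10,000 devices/hour target: roughly 46 seconds.
console.log(drainTimeSeconds(127, 10000).toFixed(1));
```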

Scaling Strategies

Horizontal Scaling:

  • Multiple discovery engines
  • Distributed processing
  • Load balancing
  • Regional collectors

Optimization Techniques:

  • Parallel scanning
  • Batch processing
  • Caching mechanisms
  • Smart scheduling

Discovery Reporting

Discovery Dashboard

{
  "discovery_summary": {
    "total_devices": 15847,
    "discovered_today": 234,
    "failed_discoveries": 12,
    "success_rate": "98.7%",
    "coverage": {
      "servers": "100%",
      "workstations": "94%",
      "network": "100%",
      "cloud": "87%"
    }
  }
}

Key Reports

  1. Discovery Coverage

    • Discovered vs expected
    • Blind spots analysis
    • Credential failures
    • Network unreachable
  2. Discovery Performance

    • Scan duration trends
    • Success/failure rates
    • Resource utilization
    • Queue statistics
  3. New Device Report

    • Newly discovered items
    • Unauthorized devices
    • Shadow IT detection
    • Compliance gaps

Troubleshooting

Common Issues

Duplicate CIs Created:

Symptom: Multiple CIs for same device
Causes:
- Missing tenant filter in lookups
- Concurrent scan processing
- Changed hardware identifiers
- IP address reuse

Resolution:
- Verify tenant filtering in processors
- Run duplicate detection script
- Merge duplicate CIs carefully
- Update matching logic

Prevention:
- Always include tenant in queries
- Use hardware IDs for matching
- Implement proper error handling
- Monitor for duplicates regularly

Discovery Failures:

Symptom: Devices not discovered
Causes:
- Network connectivity
- Firewall blocking
- Invalid credentials
- Service disabled

Resolution:
- Check port access
- Verify credentials
- Review firewall logs
- Enable services

Performance Issues:

Symptom: Slow discovery
Causes:
- Network congestion
- Overloaded targets
- Large scan ranges
- Insufficient resources

Resolution:
- Adjust scheduling
- Limit concurrent scans
- Increase resources
- Optimize queries

CI Matching Failures:

Symptom: Existing CIs are not updated; new CIs are created instead
Causes:
- Missing serial numbers
- Changed MAC addresses
- Dynamic IP assignment
- Tenant mismatch

Resolution:
- Verify hardware identifiers present
- Check tenant assignment
- Review matching hierarchy
- Update lookup logic if needed

Best Practices

1. Planning

  • ✅ Map network topology first
  • ✅ Identify critical systems
  • ✅ Plan discovery phases
  • ✅ Set realistic schedules

2. Implementation

  • ✅ Start with small pilot
  • ✅ Validate discovered data
  • ✅ Tune discovery patterns
  • ✅ Monitor performance

3. Optimization

  • ✅ Regular schedule review
  • ✅ Credential maintenance
  • ✅ Performance tuning
  • ✅ Coverage analysis

4. Governance

  • ✅ Discovery approval process
  • ✅ Change notification
  • ✅ Compliance validation
  • ✅ Regular audits

Next Steps