Logo

Critical Review: Experiment 5 (Conservation Laws)

Critical Review: Experiment 5 (Conservation Laws)

Issues Identified and Tested

1. Output Scaling Test

Hypothesis: Large output ranges (-121 to +128, std ≈ 35) might be causing learning difficulties.

Test: Applied StandardScaler to outputs before training.

Result: R² = 0.2799 (identical to without scaling)

Conclusion: Output scaling does NOT help. The problem is not about output scale.

2. Brightness Hyperparameter Test

Hypothesis: brightness=0.001 might not be optimal for this problem.

Test: Tested brightness values [0.0001, 0.001, 0.01, 0.1, 1.0]

Results:

  • brightness=0.0001: R² = 0.0255 ❌
  • brightness=0.001: R² = 0.2799 ✅ (best)
  • brightness=0.01: R² = -0.3846 ❌
  • brightness=0.1: R² = -1.7143 ❌
  • brightness=1.0: R² = -0.0484 ❌

Conclusion: brightness=0.001 is optimal. The hyperparameter is well-tuned.

3. Baseline Comparison

Test: Compared with polynomial baseline (degree 4)

Results:

  • Darwinian (Polynomial): R² = 0.9949 ✅
  • Chaos Model: R² = 0.2799 ❌

Conclusion: The problem IS learnable (baseline succeeds), but the chaos model fails. This is a genuine model limitation, not a problem design flaw.

4. Data Generation Validation

Test: Verified physics correctness

Results:

  • Momentum conservation error: mean = 1.67e-14 (perfect)
  • Energy conservation error: mean = 6.29e-13 (perfect)

Conclusion: Data generation is physically correct. No bugs in simulator.

5. Relationship Complexity Analysis

Formula: v1=(m1m2)v1+2m2v2m1+m2v'_1 = \frac{(m_1 - m_2)v_1 + 2m_2v_2}{m_1 + m_2}

Characteristics:

  • Non-linear (division by sum)
  • Involves interactions between all inputs
  • Output range depends on input combinations

Question: Is this too complex for the chaos model?

Answer: The polynomial baseline (degree 4) can learn it, so complexity alone is not the issue.

Root Cause Analysis

Why Does the Chaos Model Fail?

  1. Feature Saturation: Features are in [0, 0.5] range (tanh output), with mean ≈ 0.02, std ≈ 0.03. This is very small and may not capture enough information.

  2. Limited Expressiveness: The FFT transformation may not naturally encode division operations that are central to the collision formula.

  3. Ridge Regression Limitation: With 4096 features but only 3000 samples, Ridge regression may be underfitting or the features may not be informative enough.

  4. Comparison with Successful Experiments:

    • Experiment 1 (Ballistics): R² = 0.9999 ✅
    • Experiment 2 (Relativity): R² = 1.0000 ✅
    • Experiment 5 (Conservation): R² = 0.2799 ❌

    Key Difference: Experiments 1-2 involve multiplicative relationships (v², sin, sqrt), while Experiment 5 involves division. The chaos model may be better at multiplicative than divisive relationships.

Validated Conclusions

What We Know for Certain:

  1. Data is correct: Physics simulator works perfectly (conservation verified)
  2. Hyperparameters are optimal: brightness=0.001 is best
  3. Output scaling doesn't help: Not a scaling issue
  4. Problem is learnable: Baseline achieves R² = 0.99
  5. Chaos model genuinely fails: This is a real limitation, not a bug

What This Means:

The low R² = 0.28 is a genuine model limitation, not an experimental design flaw. The chaos model simply cannot learn division-based relationships as well as it learns multiplicative ones.

Recommendations

For Documentation:

  1. Acknowledge the limitation: The chaos model struggles with division operations
  2. Note the baseline success: The problem is learnable, just not by this architecture
  3. Compare with other experiments: Division vs. multiplication may be a key factor

For Future Work:

  1. Test with different architectures: Maybe a different chaos transformation would work
  2. Explicit feature engineering: Could add ratio features (m1/m2, etc.) to help
  3. Hybrid approaches: Combine chaos with explicit division features

Final Verdict

Status: ✅ EXPERIMENT IS VALID

The low performance (R² = 0.28) is a genuine finding about model limitations, not an experimental artifact. The experiment correctly identifies that:

  • The problem is learnable (baseline succeeds)
  • The chaos model fails (genuine limitation)
  • This may be due to difficulty with division operations

The experiment design is sound. The results are honest and meaningful.

© 2025 All rights reservedBuilt with DataHub Cloud

Built with LogoDataHub Cloud