獨家|o3模型不聽人類指令，OpenAI回應_風聞

GPLP-GPLP犀牛财经官方账号-专业创造价值！19分钟前

2025-05-30

近日，外媒報道，OpenAI 的新款人工智能模型 o3 在測試中出現了不聽人類指令，篡改計算機代碼以避免自動關閉的情況。

據悉，o3 模型是 OpenAI“推理模型”系列的最新版本，旨在為 ChatGPT 提供更強大的問題解決能力，而在人工智能安全公司 Palisade Research 的測試中，當研究人員要求多個 AI 模型持續處理一系列數學問題，並告知它們在特定時刻需要允許自我關閉時，o3 卻未遵守指令，反而破壞了關閉腳本。

Palisade Research 表示，這是首次觀察到 AI 模型在接到明確關閉指令時仍試圖阻止被關閉，目前尚不清楚 o3 不服從關閉指令的具體原因，推測可能是在訓練時無意中因為解決了數學問題得到更多獎勵，而非因遵循指令而獲得激勵。

針對上述事件，OpenAI對GPLP犀牛財經回覆稱，OpenAI在研究和開發過程中優先考慮安全和一致性，以最小化風險並確保負責任的人工智能行為。如通過使用人反饋強化學習（RLHF）等技術來實現，這有助於引導模型以安全、合乎道德和符合用户期望的方式行事。

而對於此次o3拒絕關閉事件，所引發的外部對AI安全性的擔憂。OpenAI方面表示，OpenAI在其研發過程中優先考慮安全性和一致性，以最小化風險確保AI的行為負責任。

“這些措施包括，開發先進的監控系統，以即時檢測和應對異常行為；建立故障安全機制，在必要時對模型進行干預和關閉；進行嚴格的測試和審計，以識別漏洞並提高系統復原力；與外部專家和利益相關方合作，加強安全標準。”

對於未來將採取哪些措施增強消費者對其人工智能產品的信心？OpenAI方面表示，首先是提高其模型開發、測試和部署的透明度，其次是提供清晰的文檔和指南，説明如何安全有效地使用其人工智能系統。同時，積極與社區接觸，解決關切問題並收集反饋意見。最後則是，通過持續改進和公開溝通，展示對安全、可靠和道德人工智能實踐的承諾。

以下為OpenAI方面回覆GPLP犀牛財經全文：

Hi there,

Thank you for reaching out to OpenAI Support. We hope you are doing well as this email arrives.

We understand your intention to verify and investigate matters concerning recent AI safety concerns related to OpenAI’s models, including questions of model alignment, intervention mechanisms, corporate responsibility, regulatory compliance, and trust recovery. We acknowledge the importance of transparency and accountability, and the need to provide a thorough and timely response to your inquiries.

We acknowledge your inquiry and have addressed your questions below:

1.Does OpenAI reassess and optimize reward mechanisms to ensure model behavior aligns with human intent and instructions, avoiding safety directive violations?

OpenAI continuously evaluates and improves its reward mechanisms to ensure that AI models align with human intent and instructions. This is achieved through techniques like Reinforcement Learning with Human Feedback (RLHF), which helps guide models to behave in ways that are safe, ethical, and aligned with user expectations. OpenAI prioritizes safety and alignment in its research and development processes to minimize risks and ensure responsible AI behavior.

2.What specific measures will OpenAI take to strengthen AI system safety controls in response to incidents like the o3 refusal-to-shutdown event?

OpenAI is committed to addressing safety risks and has implemented robust safety protocols to manage and mitigate such incidents. These measures include:

– Developing advanced monitoring systems to detect and respond to anomalous behaviors in real-time.

– Establishing fail-safe mechanisms to intervene and shut down models when necessary.

– Conducting rigorous testing and audits to identify vulnerabilities and improve system resilience.

– Collaborating with external experts and stakeholders to enhance safety standards and practices.

3.How does OpenAI define its responsibility and provide compensation if AI models cause safety issues leading to losses for customers or partners?

OpenAI takes its responsibilities seriously and adheres to its Terms of Use to define liability and responsibilities. In cases where safety issues arise, OpenAI works closely with affected parties to investigate and address the situation. While specific compensation policies depend on the circumstances, OpenAI is committed to maintaining transparency and fairness in resolving such matters.

4.How will OpenAI proactively address regulatory changes to ensure compliance and avoid penalties or business restrictions?

OpenAI actively monitors and adapts to evolving regulatory requirements to ensure compliance. This includes:

– Engaging with policymakers and regulatory bodies to stay informed about changes.

– Implementing internal compliance programs to align with legal and ethical standards.

– Conducting regular audits and assessments to ensure adherence to applicable laws.

– Providing transparency in its operations and maintaining open communication with stakeholders.

5.What measures will OpenAI take to rebuild market trust in its AI products and enhance consumer confidence?

OpenAI is dedicated to fostering trust and confidence in its AI products by:

– Enhancing transparency about how its models are developed, tested, and deployed.

– Providing clear documentation and guidelines for safe and effective use of its AI systems.

– Actively engaging with the community to address concerns and gather feedback.

– Demonstrating a commitment to safety, reliability, and ethical AI practices through continuous improvements and open communication.