Stabilizing the Q-Gradient Field for Policy Smoothness in Actor-Critic | Signal Canvas | ScienceToStartup