On model-free reinforcement learning for switched linear systems: A subspace clustering approach