To enable fault diagnosis when only healthy-state data are available, an optimized Swin Transformer deep neural network architecture is constructed to extract and reconstruct the features of healthy data, and an unsupervised learning method for rolling bearing fault diagnosis is proposed on this basis. The proposed method achieves an accuracy of 98.62%, whereas the comparison networks (autoencoder, deep encoder, convolutional autoencoder, and sparse autoencoder) achieve 76.46%, 68.69%, 77.69%, and 68.00%, respectively. The proposed method therefore improves accuracy by more than 20 percentage points over every comparison network.
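The general idea of diagnosing faults from healthy data alone can be illustrated with a minimal reconstruction-error sketch. This is not the Swin Transformer architecture of this work: as a loudly labeled assumption, a linear PCA reconstructor stands in for the network, the data are synthetic, and the function names, the 99th-percentile threshold, and the feature dimensions are all hypothetical choices made for illustration. The sketch fits the reconstructor on healthy samples only, then flags any sample whose reconstruction error exceeds a threshold derived from the healthy errors.

```python
import numpy as np


def fit_reconstructor(healthy, n_components):
    """Fit a linear (PCA) reconstructor on healthy samples only.

    Stand-in for a learned network; returns the mean and principal axes.
    """
    mean = healthy.mean(axis=0)
    _, _, vt = np.linalg.svd(healthy - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(x, mean, components):
    """Per-sample L2 distance between each input and its reconstruction."""
    centered = x - mean
    recon = centered @ components.T @ components
    return np.linalg.norm(centered - recon, axis=1)


def fit_threshold(healthy_errors, percentile=99.0):
    """Assumed rule: threshold at a high percentile of healthy errors."""
    return np.percentile(healthy_errors, percentile)


def diagnose(x, mean, components, threshold):
    """Return True for samples flagged as faulty."""
    return reconstruction_error(x, mean, components) > threshold


# Synthetic demonstration (hypothetical data, not bearing measurements).
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 16))  # healthy data lie near a 2-D subspace

healthy_train = rng.normal(size=(500, 2)) @ basis + 0.01 * rng.normal(size=(500, 16))
healthy_test = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 16))
# "Faulty" samples carry a large component outside the healthy subspace.
faulty_test = rng.normal(size=(50, 2)) @ basis + 1.0 * rng.normal(size=(50, 16))

mean, components = fit_reconstructor(healthy_train, n_components=2)
threshold = fit_threshold(reconstruction_error(healthy_train, mean, components))

faulty_flags = diagnose(faulty_test, mean, components, threshold)
healthy_flags = diagnose(healthy_test, mean, components, threshold)
print("faulty flagged:", faulty_flags.mean(), "healthy flagged:", healthy_flags.mean())
```

No fault labels are used anywhere in fitting, which is what makes the scheme unsupervised; a deep reconstructor would replace `fit_reconstructor` and `reconstruction_error` while the thresholding step could stay the same.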