Improving Federated Reinforcement Learning through Intelligent Participant Selection Based on Experience Information

Abstract: Federated Reinforcement Learning (FRL) is driving progress in the Edge AI sector by enabling distributed reinforcement learning while preserving data privacy. Participant selection in Federated Learning (FL) is particularly important because it strongly influences overall system performance and learning efficiency. However, unlike conventional FL, FRL has no pre-accumulated dataset, which makes existing FL participant selection methods difficult to apply to FRL systems.
To address these limitations, this thesis proposes two specialized participant selection techniques for FRL systems.

First, we introduce an experience-based participant selection method. It analyzes the experiences of reinforcement learning agents using dimensionality reduction and then uses a similarity measure to select agents with diverse experiences. Compared with existing methods, this approach reduced average training time by 23.65% and improved performance by 53.25% on average.
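The selection idea can be illustrated with a minimal sketch. The abstract does not name the dimensionality reduction technique, the similarity measure, or the selection rule, so everything below is an assumption made for illustration: each agent's experience is summarized as the mean of its observed state vectors, a Gaussian random projection stands in for the dimensionality reduction step, cosine similarity is the similarity measure, and a greedy rule picks agents that are least similar to those already chosen. The function names (`mean_vector`, `random_projection`, `select_diverse`) are hypothetical.

```python
import math
import random


def mean_vector(states):
    """Summarize one agent's experience as the mean of its state vectors (assumed summary)."""
    n = len(states)
    return [sum(column) / n for column in zip(*states)]


def random_projection(vectors, out_dim, seed=0):
    """Reduce dimensionality with a fixed Gaussian random matrix.

    This is a stand-in for whatever reduction technique the thesis uses.
    """
    rng = random.Random(seed)
    in_dim = len(vectors[0])
    matrix = [
        [rng.gauss(0.0, 1.0) / math.sqrt(out_dim) for _ in range(in_dim)]
        for _ in range(out_dim)
    ]
    return [[sum(w * x for w, x in zip(row, vec)) for row in matrix] for vec in vectors]


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def select_diverse(embeddings, k):
    """Greedily pick up to k agent indices with mutually dissimilar embeddings.

    Starts from agent 0 (an arbitrary choice), then repeatedly adds the agent
    whose highest similarity to the already selected agents is lowest.
    """
    selected = [0]
    while len(selected) < min(k, len(embeddings)):
        candidates = [i for i in range(len(embeddings)) if i not in selected]
        best = min(
            candidates,
            key=lambda i: max(
                cosine_similarity(embeddings[i], embeddings[j]) for j in selected
            ),
        )
        selected.append(best)
    return selected
```

In a federated round, a server could call `mean_vector` on each agent's reported states, pass the results through `random_projection`, and aggregate only the agents returned by `select_diverse`. Whether the thesis computes summaries on the server or on the agents is not stated here.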

Second, we propose CLaPS (Contrastively trained Language model-based Participant Selection). Existing participant selection techniques are hard to apply in resource-constrained distributed environments because of their high computational complexity, excessive memory requirements, and information loss. CLaPS uses a contrastively trained Transformer encoder optimized for FRL to extract key features from agent experiences, and selects participants with diverse experiences based on those features. In experiments, CLaPS improved learning speed by up to 43.8% and reduced computation time by 65.7% compared with existing approaches. Because it improves learning performance and resource efficiency together, the proposed method is applicable across a wide range of FRL scenarios, from complex environments that demand high performance to resource-constrained edge computing environments.
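To make "contrastively trained" concrete, the sketch below computes an InfoNCE-style loss, one common contrastive objective. The abstract does not say which objective CLaPS uses, how positive pairs are formed, or how experiences are fed to the encoder, so this is an assumption for illustration only. The Transformer encoder itself is omitted: `anchors` and `positives` are assumed to be embeddings it has already produced, with `positives[i]` paired with `anchors[i]` and every other row treated as a negative. The name `info_nce_loss` and the `temperature` default are hypothetical.

```python
import math


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of embedding pairs.

    positives[i] is the positive for anchors[i]; all other positives[j]
    act as negatives for anchors[i]. Embeddings are L2-normalized first,
    so the logits are cosine similarities divided by the temperature.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v] if norm > 0 else list(v)

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]

    total = 0.0
    for i, anchor in enumerate(a):
        logits = [sum(x * y for x, y in zip(anchor, pos)) / temperature for pos in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the true positive.
        total += log_denominator - logits[i]
    return total / len(a)
```

Minimizing a loss of this kind pulls paired embeddings together and pushes unpaired ones apart, which is what would let a later similarity-based step tell agents with different experiences apart.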

[Keywords] Federated Reinforcement Learning, Participant Selection, Dimensionality Reduction, Contrastive Learning, Transformer, Edge Computing