dataframe contains and operation: filtering data that satisfies multiple conditions


이휘재123 2022. 10. 5. 09:02

A particular column of a DataFrame contains the following sentences.

"apple is delicious"
"banana is delicious"
"apple and banana both are delicious"

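The filtering approach I had been using so far, contains with the '|' operator, gives the following result. Note that str.contains itself returns a boolean mask, so it is used to index df in order to get the rows.

df[df.col_name.str.contains("apple|banana")]

# Result
"apple is delicious",
"banana is delicious",
"apple and banana both are delicious"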
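To make the OR behaviour concrete, here is a minimal sketch that rebuilds the example as a small DataFrame. The column name `col_name` follows the post; the extra "cherry is delicious" row is a hypothetical addition, included only so that at least one row fails the condition.

```python
import pandas as pd

# Sample data from the post, plus one hypothetical non-matching row.
df = pd.DataFrame({
    "col_name": [
        "apple is delicious",
        "banana is delicious",
        "apple and banana both are delicious",
        "cherry is delicious",  # hypothetical row, not in the original post
    ]
})

# "apple|banana" is a regular expression: a row matches if it contains either word.
mask = df["col_name"].str.contains("apple|banana")

# The mask is a boolean Series; indexing df with it keeps the matching rows.
either = df[mask]
print(either["col_name"].tolist())
```

Every row containing at least one of the two words is kept, which is why '|' alone cannot isolate the row that contains both.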

In this case I want to keep only the last line, "apple and banana both are delicious". There are a few ways to do that.

df[(df['col_name'].str.contains('apple')) & (df['col_name'].str.contains('banana'))]

# Result
"apple and banana both are delicious"
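When there are more than two words, the same idea of combining masks with & can be written as a fold over a list. The following is a sketch, not the approach from the referenced answer: the `words` list and the use of `functools.reduce` are assumptions made for illustration, and `regex=False` is passed so each word is treated as a plain substring.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame({
    "col_name": [
        "apple is delicious",
        "banana is delicious",
        "apple and banana both are delicious",
    ]
})

words = ["apple", "banana"]  # illustrative list of required words

# One boolean mask per word; regex=False does a literal substring test.
masks = [df["col_name"].str.contains(w, regex=False) for w in words]

# Combine all masks with element-wise AND.
all_mask = reduce(operator.and_, masks)

both = df[all_mask]
print(both["col_name"].tolist())
```

This scans the column once per word, which is usually fine for a handful of words and keeps each condition easy to read.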

df[df['col_name'].str.contains(r'^(?=.*apple)(?=.*banana)')]

# Result
"apple and banana both are delicious"
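The pattern above relies on lookahead assertions: each (?=.*word) checks, starting from the beginning of the string, that the word occurs somewhere later, without consuming any characters. Because nothing is consumed, the assertions can be stacked and the order of the words in the text does not matter. A small sketch using only the standard `re` module; the "banana and apple" sentence is a hypothetical addition to show order independence.

```python
import re

# Each lookahead is zero-width, so both are evaluated from position 0.
pattern = re.compile(r"^(?=.*apple)(?=.*banana)")

sentences = [
    "apple is delicious",
    "banana is delicious",
    "apple and banana both are delicious",
    "banana and apple both are delicious",  # hypothetical: reversed word order
]

matches = [s for s in sentences if pattern.search(s)]
print(matches)

# The match itself is empty, because lookaheads consume no characters.
matched_text = pattern.search("apple and banana both are delicious").group()
print(repr(matched_text))
```

Note that `.` does not match a newline by default, so for multi-line cell values the `re.DOTALL` flag would be needed.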

# Writing every condition out by hand hurts readability as the list grows, so the approach below is recommended.
base = r'^{}'
expr = '(?=.*{})'
words = ['apple', 'banana', 'cat']  # example
pattern = base.format(''.join(expr.format(w) for w in words))

# Result
'^(?=.*apple)(?=.*banana)(?=.*cat)'

# Pass the pattern to the df condition above: df[df['col_name'].str.contains(pattern)]
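Putting the pieces together, the pattern construction can be wrapped in a small helper and applied to the DataFrame. This is a sketch under a few assumptions: the function name `build_all_words_pattern` is hypothetical, `re.escape` is added so that words containing regex metacharacters are matched literally, and `na=False` is passed so that missing values count as non-matches instead of producing NaN in the mask. The `None` row is a hypothetical addition to show that case.

```python
import re

import pandas as pd


def build_all_words_pattern(words):
    """Return a regex that matches strings containing every word in `words`."""
    # One zero-width lookahead per word; re.escape keeps each word literal.
    return "^" + "".join(f"(?=.*{re.escape(w)})" for w in words)


pattern = build_all_words_pattern(["apple", "banana"])
print(pattern)

df = pd.DataFrame({
    "col_name": [
        "apple is delicious",
        "banana is delicious",
        "apple and banana both are delicious",
        None,  # hypothetical missing value
    ]
})

# na=False turns missing values into False so the mask is usable for indexing.
result = df[df["col_name"].str.contains(pattern, na=False)]
print(result["col_name"].tolist())
```

Compared with chaining masks, this scans the column once, at the cost of a pattern that is harder to read at a glance.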

Reference

https://stackoverflow.com/questions/37011734/pandas-dataframe-str-contains-and-operation

License: Attribution, NonCommercial, NoDerivatives
