Preparing for University Course Registration

In [10]:

from IPython.display import display, HTML
display(HTML("<style>.container { width:1200px !important; }</style>"))

df.loc[condition1 & condition2, "column name"] = "new value"
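
As a minimal sketch of the pattern above, here is the same assignment on a small made-up DataFrame. The toy rows and the names toy, cond1 and cond2 are illustrative assumptions, not data from enrolment_1.csv.

```python
import pandas as pd

# Hypothetical toy data, not taken from enrolment_1.csv
toy = pd.DataFrame({
    "course name": ["science", "arts", "science", "arts"],
    "year": [1, 1, 2, 2],
})
toy["status"] = "allowed"

# Each comparison yields a boolean Series with one entry per row
cond1 = toy["course name"] == "science"
cond2 = toy["year"] == 1

# & is an element-wise AND; only rows where both conditions hold
# receive the new value in the "status" column
toy.loc[cond1 & cond2, "status"] = "not allowed"
print(toy)
```

When the conditions are written inline instead of being stored in variables, each comparison needs its own parentheses, because & binds more tightly than ==.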

Series.value_counts() returns a Series.

Series.index returns the index of that Series.

list(Series.index) turns the index values into a plain Python list.
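
The three points above can be sketched on a tiny hand-made Series. The values and the names courses, counts and rare are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical toy Series, for illustration only
courses = pd.Series(["arts", "science", "arts", "mba", "arts", "science"])

# value_counts() returns a Series: the index holds the unique values,
# the values hold how often each one occurs
counts = courses.value_counts()
print(counts)

# .index gives the labels as a pandas Index object
print(counts.index)

# Filtering the counts and wrapping the remaining index in list()
# gives a plain Python list of labels
rare = list(counts[counts < 2].index)
print(rare)
```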

When the same kind of conditional lookup has to be repeated many times, a for loop can do the repetition.
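
A rough sketch of that idea: loop over a list of values and apply the same df.loc update once per value. The toy frame and the closed list are hypothetical.

```python
import pandas as pd

# Hypothetical toy data, for illustration only
toy = pd.DataFrame({
    "course name": ["arts", "mba", "science", "mca", "arts"],
    "status": ["allowed"] * 5,
})

# Hypothetical list of courses to mark as closed
closed = ["mba", "mca"]

# One conditional update per course in the list
for course in closed:
    toy.loc[toy["course name"] == course, "status"] = "not allowed"

print(toy)
```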

In [1]:

import pandas as pd

df = pd.read_csv('./enrolment_1.csv')

In [2]:

df['status'] = 'allowed'
df.head(5)

Out[2]:

	id	year	course name	status
0	2777729	1	science	allowed
1	2777730	2	science	allowed
2	2777765	1	arts	allowed
3	2777766	2	arts	allowed
4	2777785	1	mba	allowed

In [3]:

con1 = (df['course name'] == 'information technology') & (df['year'] == 1)  # condition 1: first-year information technology
df.loc[con1, 'status'] = 'not allowed'
df.status.value_counts()

Out[3]:

allowed        1994
not allowed       6
Name: status, dtype: int64

In [4]:

con2 = (df['course name'] == 'commerce') & (df['year'] == 1)  # condition 2 (note: In [8] and the Codeit solution use year == 4 here)
df.loc[con2, 'status'] = 'not allowed'
df.status.value_counts()

Out[4]:

allowed        1956
not allowed      44
Name: status, dtype: int64

In [5]:

# Group by ['course name', 'status'] and add each row's group size as a new column

df['cnt'] = df.groupby(['course name', 'status']).id.transform('count')
df.head(10)

Out[5]:

	id	year	course name	status	cnt
0	2777729	1	science	allowed	124
1	2777730	2	science	allowed	124
2	2777765	1	arts	allowed	158
3	2777766	2	arts	allowed	158
4	2777785	1	mba	allowed	6
5	2777786	2	mba	allowed	6
6	2777793	1	mba 2nd shift	allowed	2
7	2777794	2	mba 2nd shift	allowed	2
8	2777795	1	mca 2nd shift	allowed	3
9	2777796	2	mca 2nd shift	allowed	3
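
To make the difference between an aggregated count and transform('count') concrete, here is a small sketch on made-up rows. The toy data and the names grouped, per_group and per_row are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data, not taken from enrolment_1.csv
toy = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "course name": ["arts", "arts", "mba", "arts", "mba"],
    "status": ["allowed", "allowed", "allowed", "not allowed", "allowed"],
})

grouped = toy.groupby(["course name", "status"])["id"]

# count() collapses the data to one row per (course name, status) group
per_group = grouped.count()
print(per_group)

# transform("count") keeps the original shape: every row receives the
# size of the group it belongs to, aligned with toy.index
per_row = grouped.transform("count")
toy["cnt"] = per_row
print(toy)
```

Because the transformed result has the same index as the original frame, it can be assigned straight to a new column and then used in a row filter such as toy["cnt"] < 5.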

In [6]:

df.loc[df['cnt'] < 5, 'status'] = 'not allowed'
df.status.value_counts()

Out[6]:

allowed        1414
not allowed     586
Name: status, dtype: int64

In [7]:

df_my = df.drop(columns=['cnt'])
df_my.status.value_counts()

Out[7]:

allowed        1414
not allowed     586
Name: status, dtype: int64

In [8]:

import pandas as pd

df = pd.read_csv('./enrolment_1.csv')

df['status'] = 'allowed'

# Condition 1: first-year information technology
con1 = (df['course name'] == 'information technology') & (df['year'] == 1)
df.loc[con1, 'status'] = 'not allowed'

# Condition 2: fourth-year commerce
con2 = (df['course name'] == 'commerce') & (df['year'] == 4)
df.loc[con2, 'status'] = 'not allowed'

# Group by ['course name', 'status'] and add each row's group size as a new column
df['cnt'] = df.groupby(['course name', 'status']).id.transform('count')

# Condition 3: groups with fewer than 5 rows
df.loc[df['cnt'] < 5, 'status'] = 'not allowed'

df_my = df.drop(columns=['cnt'])

df_my.status.value_counts()

Out[8]:

allowed        1448
not allowed     552
Name: status, dtype: int64

Codeit's solution

In [9]:

import pandas as pd

df = pd.read_csv('./enrolment_1.csv')
df["status"] = "allowed"

# Condition 1
boolean1 = df["course name"] == "information technology"
boolean2 = df["year"] == 1
df.loc[boolean1 & boolean2, "status"] = "not allowed"

# Condition 2
boolean3 = df["course name"] == "commerce"
boolean4 = df["year"] == 4
df.loc[boolean3 & boolean4, "status"] = "not allowed"

# Condition 3
allowed = df["status"] == "allowed"
course_counts = df.loc[allowed, "course name"].value_counts()
closed_courses = list(course_counts[course_counts < 5].index)
for course in closed_courses:
    df.loc[df["course name"] == course, "status"] = "not allowed"

# Check the answer
df.status.value_counts()

Out[9]:

allowed        1448
not allowed     552
Name: status, dtype: int64
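
As a possible variation on condition 3 (not part of the Codeit solution), the for loop can be swapped for a single Series.isin() call. The sketch below uses made-up rows in which arts keeps 5 allowed students while mba and mca fall below the threshold.

```python
import pandas as pd

# Hypothetical toy data, not taken from enrolment_1.csv
toy = pd.DataFrame({
    "course name": ["arts"] * 6 + ["mba"] * 2 + ["mca"] * 3,
    "status": ["allowed"] * 11,
})
toy.loc[0, "status"] = "not allowed"  # arts is left with 5 allowed rows

# Count only the rows that are still allowed, per course
allowed = toy["status"] == "allowed"
course_counts = toy.loc[allowed, "course name"].value_counts()

# Courses with fewer than 5 allowed students
closed_courses = course_counts[course_counts < 5].index

# isin() builds one boolean mask covering every closed course at once
toy.loc[toy["course name"].isin(closed_courses), "status"] = "not allowed"
print(toy["status"].value_counts())
```

Both forms mark the same rows. The loop reads closer to the problem statement, while isin() avoids one pass over the frame per closed course.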
