Compare commits

...

3 Commits

Author SHA1 Message Date
7a823377f4 .. 2025-06-02 08:42:19 +00:00
1ad58eaa81 bot deny, add hearder 2025-05-11 00:51:05 +09:00
084f46edd9 Naver crawling && draft post registration via the WP API 2025-01-19 23:58:30 +09:00
23 changed files with 717 additions and 25 deletions


@ -1,9 +1,24 @@
# wp-post-automation
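## Purpose
* Implement automated WordPress posting.
## Key Points
### Project Overview
2025.01.19 - Implemented Naver blog scraping and WordPress draft post registration
* Initial feature implementation complete.
2024.10.04 - Testing complete. First phase of the project finished.
* WordPress posting automation project.
* Ported the automation flow originally built with make.com to Python.
## Existing Flow
### Usage
* Copy sample.env.dev into the root directory as one of .env.prd | .env.dev | .env.
* main.py is the default automation process.
* The crawled post is rewritten by OpenAI and registered as Markdown.
* main_naver_blog_html is the process dedicated to Naver blogs.
* The crawled post is registered as Markdown without modification.
---
## Reference Workflow
* Fetch the most recent reference URL stored in MariaDB.
* Retrieve the reference material with the HTTP module.
* Extract only the text from the retrieved HTML.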
@ -15,35 +30,48 @@
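* Post to WordPress.
## Development Plan
### 2025.01.19
* Crawl Naver blog posts.
* Register them as WordPress drafts via the API.
### 2024.10.04
* Reimplement the existing flow in Python.
* Build it as a container that runs when a trigger fires.
* Implement it as a one-off run using kubectl create -f file.yaml.
### Python Development Steps
## Python Development Steps
### 2025.01.19
* No DB integration at present; a URL must be entered at runtime (done).
* Parse the page at the URL and extract only the text (done).
* Extract the content as Markdown (done).
* Write the HTML document conversion code (done).
* Register a draft post via the WordPress API (done).
### 2024.10.04
* Write the code that fetches the URL from the DB (done).
* Parse the page at the URL and extract only the text (done).
* Write the OpenAI integration code (done); title and image generation are omitted to reduce cost.
* Write the HTML document conversion code (done).
* Write the WordPress registration flow code (done).
### Code Refactoring.
## Code Refactoring.
### 2024.10.04
* Full refactoring (done).
* Modularization and packaging (done).
## Docker Image Build
2024.10.04 update
### 2024.10.04 update
* Add a Dockerfile (done).
## Kubernetes manifests
2024.10.04 update
### 2024.10.04 update
* Write a sample template (done)
* Test in a Kubernetes environment (done).
---
## Code Issues
### Naver Blog Crawling
As of 2024.10.02
* Ordinary news articles currently appear to work well.
* Naver blogs apparently cannot be crawled, presumably because of a JavaScript issue; Selenium should be evaluated.
* Nothing notable.
---
## License
### Licenses to Review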


@ -0,0 +1,37 @@
import requests
from bs4 import BeautifulSoup


def get_naver_blog_content(url):
    # Rewrite the URL to the mobile version of the Naver blog
    mobile_url = url.replace("blog.naver.com", "m.blog.naver.com")
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
    }
    # HTTP request
    response = requests.get(mobile_url, headers=headers)
    if response.status_code != 200:
        print(f"Failed to fetch the page: {response.status_code}")
        return None
    # Parse the HTML with BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')
    # Extract the body (using the body class of the mobile version)
    content = soup.find("div", class_="se-main-container")
    if content:
        return content.get_text(strip=True)
    else:
        print("Failed to extract the blog content.")
        return None


# Example URL
url = "https://blog.naver.com/kte1909/223724132196"
blog_content = get_naver_blog_content(url)
if blog_content:
    print("Blog Content:")
    print(blog_content)
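
The scraper above switches to the mobile host with a plain `str.replace`, which would turn a URL that is already on `m.blog.naver.com` into `m.m.blog.naver.com`. A minimal sketch of a host-aware rewrite follows; `to_mobile_url` is a hypothetical helper that is not part of the commit, and it assumes only the exact host `blog.naver.com` needs rewriting.

```python
from urllib.parse import urlsplit, urlunsplit


def to_mobile_url(url: str) -> str:
    """Return the m.blog.naver.com form of a Naver blog URL.

    Hypothetical helper: only the exact host "blog.naver.com" is rewritten,
    so a URL that is already on the mobile host is returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc == "blog.naver.com":
        parts = parts._replace(netloc="m.blog.naver.com")
    return urlunsplit(parts)
```

Comparing the host instead of replacing a substring keeps the function idempotent, so it can be applied to user input without first checking which form was pasted.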

247
bin/Activate.ps1 Normal file

@ -0,0 +1,247 @@
<#
.Synopsis
Activate a Python virtual environment for the current PowerShell session.
.Description
Pushes the python executable for a virtual environment to the front of the
$Env:PATH environment variable and sets the prompt to signify that you are
in a Python virtual environment. Makes use of the command line switches as
well as the `pyvenv.cfg` file values present in the virtual environment.
.Parameter VenvDir
Path to the directory that contains the virtual environment to activate. The
default value for this is the parent of the directory that the Activate.ps1
script is located within.
.Parameter Prompt
The prompt prefix to display when this virtual environment is activated. By
default, this prompt is the name of the virtual environment folder (VenvDir)
surrounded by parentheses and followed by a single space (ie. '(.venv) ').
.Example
Activate.ps1
Activates the Python virtual environment that contains the Activate.ps1 script.
.Example
Activate.ps1 -Verbose
Activates the Python virtual environment that contains the Activate.ps1 script,
and shows extra information about the activation as it executes.
.Example
Activate.ps1 -VenvDir C:\Users\MyUser\Common\.venv
Activates the Python virtual environment located in the specified location.
.Example
Activate.ps1 -Prompt "MyPython"
Activates the Python virtual environment that contains the Activate.ps1 script,
and prefixes the current prompt with the specified string (surrounded in
parentheses) while the virtual environment is active.
.Notes
On Windows, it may be required to enable this Activate.ps1 script by setting the
execution policy for the user. You can do this by issuing the following PowerShell
command:
PS C:\> Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
For more information on Execution Policies:
https://go.microsoft.com/fwlink/?LinkID=135170
#>
Param(
[Parameter(Mandatory = $false)]
[String]
$VenvDir,
[Parameter(Mandatory = $false)]
[String]
$Prompt
)
<# Function declarations --------------------------------------------------- #>
<#
.Synopsis
Remove all shell session elements added by the Activate script, including the
addition of the virtual environment's Python executable from the beginning of
the PATH variable.
.Parameter NonDestructive
If present, do not remove this function from the global namespace for the
session.
#>
function global:deactivate ([switch]$NonDestructive) {
# Revert to original values
# The prior prompt:
if (Test-Path -Path Function:_OLD_VIRTUAL_PROMPT) {
Copy-Item -Path Function:_OLD_VIRTUAL_PROMPT -Destination Function:prompt
Remove-Item -Path Function:_OLD_VIRTUAL_PROMPT
}
# The prior PYTHONHOME:
if (Test-Path -Path Env:_OLD_VIRTUAL_PYTHONHOME) {
Copy-Item -Path Env:_OLD_VIRTUAL_PYTHONHOME -Destination Env:PYTHONHOME
Remove-Item -Path Env:_OLD_VIRTUAL_PYTHONHOME
}
# The prior PATH:
if (Test-Path -Path Env:_OLD_VIRTUAL_PATH) {
Copy-Item -Path Env:_OLD_VIRTUAL_PATH -Destination Env:PATH
Remove-Item -Path Env:_OLD_VIRTUAL_PATH
}
# Just remove the VIRTUAL_ENV altogether:
if (Test-Path -Path Env:VIRTUAL_ENV) {
Remove-Item -Path env:VIRTUAL_ENV
}
# Just remove VIRTUAL_ENV_PROMPT altogether.
if (Test-Path -Path Env:VIRTUAL_ENV_PROMPT) {
Remove-Item -Path env:VIRTUAL_ENV_PROMPT
}
# Just remove the _PYTHON_VENV_PROMPT_PREFIX altogether:
if (Get-Variable -Name "_PYTHON_VENV_PROMPT_PREFIX" -ErrorAction SilentlyContinue) {
Remove-Variable -Name _PYTHON_VENV_PROMPT_PREFIX -Scope Global -Force
}
# Leave deactivate function in the global namespace if requested:
if (-not $NonDestructive) {
Remove-Item -Path function:deactivate
}
}
<#
.Description
Get-PyVenvConfig parses the values from the pyvenv.cfg file located in the
given folder, and returns them in a map.
For each line in the pyvenv.cfg file, if that line can be parsed into exactly
two strings separated by `=` (with any amount of whitespace surrounding the =)
then it is considered a `key = value` line. The left hand string is the key,
the right hand is the value.
If the value starts with a `'` or a `"` then the first and last character is
stripped from the value before being captured.
.Parameter ConfigDir
Path to the directory that contains the `pyvenv.cfg` file.
#>
function Get-PyVenvConfig(
[String]
$ConfigDir
) {
Write-Verbose "Given ConfigDir=$ConfigDir, obtain values in pyvenv.cfg"
# Ensure the file exists, and issue a warning if it doesn't (but still allow the function to continue).
$pyvenvConfigPath = Join-Path -Resolve -Path $ConfigDir -ChildPath 'pyvenv.cfg' -ErrorAction Continue
# An empty map will be returned if no config file is found.
$pyvenvConfig = @{ }
if ($pyvenvConfigPath) {
Write-Verbose "File exists, parse `key = value` lines"
$pyvenvConfigContent = Get-Content -Path $pyvenvConfigPath
$pyvenvConfigContent | ForEach-Object {
$keyval = $PSItem -split "\s*=\s*", 2
if ($keyval[0] -and $keyval[1]) {
$val = $keyval[1]
# Remove extraneous quotations around a string value.
if ("'""".Contains($val.Substring(0, 1))) {
$val = $val.Substring(1, $val.Length - 2)
}
$pyvenvConfig[$keyval[0]] = $val
Write-Verbose "Adding Key: '$($keyval[0])'='$val'"
}
}
}
return $pyvenvConfig
}
<# Begin Activate script --------------------------------------------------- #>
# Determine the containing directory of this script
$VenvExecPath = Split-Path -Parent $MyInvocation.MyCommand.Definition
$VenvExecDir = Get-Item -Path $VenvExecPath
Write-Verbose "Activation script is located in path: '$VenvExecPath'"
Write-Verbose "VenvExecDir Fullname: '$($VenvExecDir.FullName)"
Write-Verbose "VenvExecDir Name: '$($VenvExecDir.Name)"
# Set values required in priority: CmdLine, ConfigFile, Default
# First, get the location of the virtual environment, it might not be
# VenvExecDir if specified on the command line.
if ($VenvDir) {
Write-Verbose "VenvDir given as parameter, using '$VenvDir' to determine values"
}
else {
Write-Verbose "VenvDir not given as a parameter, using parent directory name as VenvDir."
$VenvDir = $VenvExecDir.Parent.FullName.TrimEnd("\\/")
Write-Verbose "VenvDir=$VenvDir"
}
# Next, read the `pyvenv.cfg` file to determine any required value such
# as `prompt`.
$pyvenvCfg = Get-PyVenvConfig -ConfigDir $VenvDir
# Next, set the prompt from the command line, or the config file, or
# just use the name of the virtual environment folder.
if ($Prompt) {
Write-Verbose "Prompt specified as argument, using '$Prompt'"
}
else {
Write-Verbose "Prompt not specified as argument to script, checking pyvenv.cfg value"
if ($pyvenvCfg -and $pyvenvCfg['prompt']) {
Write-Verbose " Setting based on value in pyvenv.cfg='$($pyvenvCfg['prompt'])'"
$Prompt = $pyvenvCfg['prompt'];
}
else {
Write-Verbose " Setting prompt based on parent's directory's name. (Is the directory name passed to venv module when creating the virtual environment)"
Write-Verbose " Got leaf-name of $VenvDir='$(Split-Path -Path $venvDir -Leaf)'"
$Prompt = Split-Path -Path $venvDir -Leaf
}
}
Write-Verbose "Prompt = '$Prompt'"
Write-Verbose "VenvDir='$VenvDir'"
# Deactivate any currently active virtual environment, but leave the
# deactivate function in place.
deactivate -nondestructive
# Now set the environment variable VIRTUAL_ENV, used by many tools to determine
# that there is an activated venv.
$env:VIRTUAL_ENV = $VenvDir
if (-not $Env:VIRTUAL_ENV_DISABLE_PROMPT) {
Write-Verbose "Setting prompt to '$Prompt'"
# Set the prompt to include the env name
# Make sure _OLD_VIRTUAL_PROMPT is global
function global:_OLD_VIRTUAL_PROMPT { "" }
Copy-Item -Path function:prompt -Destination function:_OLD_VIRTUAL_PROMPT
New-Variable -Name _PYTHON_VENV_PROMPT_PREFIX -Description "Python virtual environment prompt prefix" -Scope Global -Option ReadOnly -Visibility Public -Value $Prompt
function global:prompt {
Write-Host -NoNewline -ForegroundColor Green "($_PYTHON_VENV_PROMPT_PREFIX) "
_OLD_VIRTUAL_PROMPT
}
$env:VIRTUAL_ENV_PROMPT = $Prompt
}
# Clear PYTHONHOME
if (Test-Path -Path Env:PYTHONHOME) {
Copy-Item -Path Env:PYTHONHOME -Destination Env:_OLD_VIRTUAL_PYTHONHOME
Remove-Item -Path Env:PYTHONHOME
}
# Add the venv to the PATH
Copy-Item -Path Env:PATH -Destination Env:_OLD_VIRTUAL_PATH
$Env:PATH = "$VenvExecDir$([System.IO.Path]::PathSeparator)$Env:PATH"

69
bin/activate Normal file

@ -0,0 +1,69 @@
# This file must be used with "source bin/activate" *from bash*
# you cannot run it directly
deactivate () {
# reset old environment variables
if [ -n "${_OLD_VIRTUAL_PATH:-}" ] ; then
PATH="${_OLD_VIRTUAL_PATH:-}"
export PATH
unset _OLD_VIRTUAL_PATH
fi
if [ -n "${_OLD_VIRTUAL_PYTHONHOME:-}" ] ; then
PYTHONHOME="${_OLD_VIRTUAL_PYTHONHOME:-}"
export PYTHONHOME
unset _OLD_VIRTUAL_PYTHONHOME
fi
# This should detect bash and zsh, which have a hash command that must
# be called to get it to forget past commands. Without forgetting
# past commands the $PATH changes we made may not be respected
if [ -n "${BASH:-}" -o -n "${ZSH_VERSION:-}" ] ; then
hash -r 2> /dev/null
fi
if [ -n "${_OLD_VIRTUAL_PS1:-}" ] ; then
PS1="${_OLD_VIRTUAL_PS1:-}"
export PS1
unset _OLD_VIRTUAL_PS1
fi
unset VIRTUAL_ENV
unset VIRTUAL_ENV_PROMPT
if [ ! "${1:-}" = "nondestructive" ] ; then
# Self destruct!
unset -f deactivate
fi
}
# unset irrelevant variables
deactivate nondestructive
VIRTUAL_ENV=/home/ubuntu/gitea-icurfer/wp-post-automation
export VIRTUAL_ENV
_OLD_VIRTUAL_PATH="$PATH"
PATH="$VIRTUAL_ENV/"bin":$PATH"
export PATH
# unset PYTHONHOME if set
# this will fail if PYTHONHOME is set to the empty string (which is bad anyway)
# could use `if (set -u; : $PYTHONHOME) ;` in bash
if [ -n "${PYTHONHOME:-}" ] ; then
_OLD_VIRTUAL_PYTHONHOME="${PYTHONHOME:-}"
unset PYTHONHOME
fi
if [ -z "${VIRTUAL_ENV_DISABLE_PROMPT:-}" ] ; then
_OLD_VIRTUAL_PS1="${PS1:-}"
PS1='(wp-post-automation) '"${PS1:-}"
export PS1
VIRTUAL_ENV_PROMPT='(wp-post-automation) '
export VIRTUAL_ENV_PROMPT
fi
# This should detect bash and zsh, which have a hash command that must
# be called to get it to forget past commands. Without forgetting
# past commands the $PATH changes we made may not be respected
if [ -n "${BASH:-}" -o -n "${ZSH_VERSION:-}" ] ; then
hash -r 2> /dev/null
fi

26
bin/activate.csh Normal file

@ -0,0 +1,26 @@
# This file must be used with "source bin/activate.csh" *from csh*.
# You cannot run it directly.
# Created by Davide Di Blasi <davidedb@gmail.com>.
# Ported to Python 3.3 venv by Andrew Svetlov <andrew.svetlov@gmail.com>
alias deactivate 'test $?_OLD_VIRTUAL_PATH != 0 && setenv PATH "$_OLD_VIRTUAL_PATH" && unset _OLD_VIRTUAL_PATH; rehash; test $?_OLD_VIRTUAL_PROMPT != 0 && set prompt="$_OLD_VIRTUAL_PROMPT" && unset _OLD_VIRTUAL_PROMPT; unsetenv VIRTUAL_ENV; unsetenv VIRTUAL_ENV_PROMPT; test "\!:*" != "nondestructive" && unalias deactivate'
# Unset irrelevant variables.
deactivate nondestructive
setenv VIRTUAL_ENV /home/ubuntu/gitea-icurfer/wp-post-automation
set _OLD_VIRTUAL_PATH="$PATH"
setenv PATH "$VIRTUAL_ENV/"bin":$PATH"
set _OLD_VIRTUAL_PROMPT="$prompt"
if (! "$?VIRTUAL_ENV_DISABLE_PROMPT") then
set prompt = '(wp-post-automation) '"$prompt"
setenv VIRTUAL_ENV_PROMPT '(wp-post-automation) '
endif
alias pydoc python -m pydoc
rehash

69
bin/activate.fish Normal file

@ -0,0 +1,69 @@
# This file must be used with "source <venv>/bin/activate.fish" *from fish*
# (https://fishshell.com/); you cannot run it directly.
function deactivate -d "Exit virtual environment and return to normal shell environment"
# reset old environment variables
if test -n "$_OLD_VIRTUAL_PATH"
set -gx PATH $_OLD_VIRTUAL_PATH
set -e _OLD_VIRTUAL_PATH
end
if test -n "$_OLD_VIRTUAL_PYTHONHOME"
set -gx PYTHONHOME $_OLD_VIRTUAL_PYTHONHOME
set -e _OLD_VIRTUAL_PYTHONHOME
end
if test -n "$_OLD_FISH_PROMPT_OVERRIDE"
set -e _OLD_FISH_PROMPT_OVERRIDE
# prevents error when using nested fish instances (Issue #93858)
if functions -q _old_fish_prompt
functions -e fish_prompt
functions -c _old_fish_prompt fish_prompt
functions -e _old_fish_prompt
end
end
set -e VIRTUAL_ENV
set -e VIRTUAL_ENV_PROMPT
if test "$argv[1]" != "nondestructive"
# Self-destruct!
functions -e deactivate
end
end
# Unset irrelevant variables.
deactivate nondestructive
set -gx VIRTUAL_ENV /home/ubuntu/gitea-icurfer/wp-post-automation
set -gx _OLD_VIRTUAL_PATH $PATH
set -gx PATH "$VIRTUAL_ENV/"bin $PATH
# Unset PYTHONHOME if set.
if set -q PYTHONHOME
set -gx _OLD_VIRTUAL_PYTHONHOME $PYTHONHOME
set -e PYTHONHOME
end
if test -z "$VIRTUAL_ENV_DISABLE_PROMPT"
# fish uses a function instead of an env var to generate the prompt.
# Save the current fish_prompt function as the function _old_fish_prompt.
functions -c fish_prompt _old_fish_prompt
# With the original prompt function renamed, we can override with our own.
function fish_prompt
# Save the return status of the last command.
set -l old_status $status
# Output the venv prompt; color taken from the blue of the Python logo.
printf "%s%s%s" (set_color 4B8BBE) '(wp-post-automation) ' (set_color normal)
# Restore the return status of the previous command.
echo "exit $old_status" | .
# Output the original/"old" prompt.
_old_fish_prompt
end
set -gx _OLD_FISH_PROMPT_OVERRIDE "$VIRTUAL_ENV"
set -gx VIRTUAL_ENV_PROMPT '(wp-post-automation) '
end

8
bin/pip Executable file

@ -0,0 +1,8 @@
#!/home/ubuntu/gitea-icurfer/wp-post-automation/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from pip._internal.cli.main import main
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(main())

8
bin/pip3 Executable file

@ -0,0 +1,8 @@
#!/home/ubuntu/gitea-icurfer/wp-post-automation/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from pip._internal.cli.main import main
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(main())

8
bin/pip3.10 Executable file

@ -0,0 +1,8 @@
#!/home/ubuntu/gitea-icurfer/wp-post-automation/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from pip._internal.cli.main import main
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(main())

1
bin/python Symbolic link

@ -0,0 +1 @@
python3

1
bin/python3 Symbolic link

@ -0,0 +1 @@
/usr/bin/python3

1
bin/python3.10 Symbolic link

@ -0,0 +1 @@
python3

79
dev.py Normal file

@ -0,0 +1,79 @@
import requests
from bs4 import BeautifulSoup
from markdownify import markdownify as md
from package import GetConfig, MariaDB, ChangeTextToPost, WordPress
import markdown


# There is currently no DB integration; a URL must be entered at runtime.
def get_naver_blog_content_as_markdown(url):
    # Rewrite the URL to the mobile version of the Naver blog
    mobile_url = url.replace("blog.naver.com", "m.blog.naver.com")
    # Browser spoofing --------------------------------------------------
    # Left out because the request works fine without it.
    # headers = {
    # "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
    # }
    # response = requests.get(mobile_url, headers=headers)
    # ---------------------------------------------------------------
    response = requests.get(mobile_url)
    if response.status_code != 200:
        print(f"Failed to fetch the page: {response.status_code}")
        return None
    # Parse the HTML with BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')
    # Extract the body (using the body class of the mobile version)
    content = soup.find("div", class_="se-main-container")
    if content:
        html_content = str(content)
        markdown_content = md(html_content)  # HTML → Markdown conversion
        # Remove blank lines
        markdown_content = "\n".join([line for line in markdown_content.splitlines() if line.strip()])
        return markdown_content
    else:
        print("Failed to extract the blog content.")
        return None


# 2024-10-03 Load environment variables
print('### Get values From .env')
config = GetConfig()
dict_data = config.get_config_as_dict()
# Example URL
url = input("Enter your blog address : ")
# markdown_content = get_naver_blog_content_as_markdown(url)
post_article = get_naver_blog_content_as_markdown(url)
post_article = post_article.replace(">", "###")
# if markdown_content:
# print("Markdown Content:")
# print(markdown_content)
# Save as a Markdown file
# with open("blog_content.md", "w", encoding="utf-8") as file:
# file.write(markdown_content)
# print("Blog content saved as blog_content.md")
# print('### Convert to HTML - markdown to html')
# # 2024-10-03 Convert Markdown to HTML
# html = markdown.markdown(post_article)
# # 2024-10-03 Register the WordPress post as a draft
# print('### Create post')
# wp = WordPress(dict_data)
# rs = wp.create_post(2,html)
# if __name__ == "__main__":
# # print(post_article)
# print("추가 확인을 위한 출력")
# if rs.ok:
# print(f"### 성공 code:{rs.status_code}")
# else:
# print(f"### 실패 code:{rs.status_code} reason:{rs.reason} msg:{rs.text}")
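
Two details of the post-processing in dev.py are worth illustrating: `get_naver_blog_content_as_markdown` can return `None`, in which case the following `.replace` call raises `AttributeError`, and `replace(">", "###")` rewrites every `>` in the text rather than only blockquote markers. The sketch below is one possible tidy-up step under those observations; `tidy_markdown` is a hypothetical name, and treating only a leading `>` as a blockquote marker is an assumption about the intent.

```python
import re
from typing import Optional


def tidy_markdown(markdown_text: Optional[str]) -> str:
    """Drop blank lines and turn leading blockquote markers into '### ' headings.

    Hypothetical helper: returns an empty string when the scraper returned None,
    and leaves any '>' that is not at the start of a line untouched.
    """
    if markdown_text is None:
        return ""
    # Keep only lines that contain something other than whitespace.
    lines = [line for line in markdown_text.splitlines() if line.strip()]
    # Rewrite a '>' (plus one optional space) only when it starts the line.
    lines = [re.sub(r"^>\s?", "### ", line) for line in lines]
    return "\n".join(lines)
```

Called as `tidy_markdown(get_naver_blog_content_as_markdown(url))`, this would let the caller check for an empty result before attempting to post.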

1
lib64 Symbolic link

@ -0,0 +1 @@
lib

31
main.py

@ -8,33 +8,42 @@ config = GetConfig()
dict_data = config.get_config_as_dict()
# 2024-10-03 Fetch the URL from the DB
# It also works without the DB if the URL is supplied directly. - 2025.01.19
print('### Get URL From DB')
db = MariaDB(dict_data)
url = db.fetch_data_from_mariadb()['url']
url = db.fetch_data_from_mariadb()['url'] # Fetch the most recent entry. - 2025.01.19
print(url)
# 2024-10-03 Extract the text from the URL
print('### Get content From URL')
origin_content = pkg.getContents(url)
# 2024-10-03 Restyle the post using OpenAI
print(origin_content)
# # 2024-10-03 Restyle the post using OpenAI
print('### Convert to Post - openAI')
openai_key = dict_data['openai_api_key']
print(f"### OpenAI Key : {openai_key}")
wp_reference_style = dict_data['wp_post_style']
print(f"### WP Reference Style : {wp_reference_style}")
open_ai = ChangeTextToPost(openai_key)
post_article = open_ai.generate_blog_post(origin_content, wp_reference_style)
print('### DEBUG ###')
print(post_article)
# print('### Convert to HTML - markdown to html')
# # 2024-10-03 Convert Markdown to HTML
# html = markdown.markdown(post_article)
print('### Convert to HTML - markdown to html')
# 2024-10-03 Convert Markdown to HTML
html = markdown.markdown(post_article)
# 2024-10-03 Register the WordPress post as a draft
print('### Create post')
wp = WordPress(dict_data)
rs = wp.create_post(2,html)
# # 2024-10-03 Register the WordPress post as a draft
# print('### Create post')
# wp = WordPress(dict_data)
# rs = wp.create_post(2,html)
if __name__ == "__main__":
    # print(post_article)
    print(post_article)
    print("추가 확인을 위한 출력")
    if rs.ok:
        print(f"### 성공 code:{rs.status_code}")
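
The debug output added to main.py prints the full OpenAI key, which would end up in container or job logs. A minimal sketch of masking a secret before logging it is shown here; `mask_secret` and its `visible` parameter are hypothetical and not part of the commit.

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Return the secret with all but the first `visible` characters replaced by '*'.

    Hypothetical helper: values no longer than `visible` are masked entirely.
    """
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)
```

With such a helper the debug line could read `print(f"### OpenAI Key : {mask_secret(openai_key)}")`, which still confirms that a key was loaded without exposing it.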

81
main_naver_blog_html.py Normal file

@ -0,0 +1,81 @@
import requests
from bs4 import BeautifulSoup
from markdownify import markdownify as md
from package import GetConfig, MariaDB, ChangeTextToPost, WordPress
import markdown
import re


# There is currently no DB integration; a URL must be entered at runtime.
def get_naver_blog_content_as_markdown(url):
    # Rewrite the URL to the mobile version of the Naver blog
    mobile_url = url.replace("blog.naver.com", "m.blog.naver.com")
    # Browser spoofing --------------------------------------------------
    # Left out because the request works fine without it.
    # headers = {
    # "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
    # }
    # response = requests.get(mobile_url, headers=headers)
    # ---------------------------------------------------------------
    response = requests.get(mobile_url)
    if response.status_code != 200:
        print(f"Failed to fetch the page: {response.status_code}")
        return None
    # Parse the HTML with BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')
    # Extract the body (using the body class of the mobile version)
    content = soup.find("div", class_="se-main-container")
    if content:
        html_content = str(content)
        markdown_content = md(html_content)  # HTML → Markdown conversion
        # Remove blank lines
        markdown_content = "\n".join([line for line in markdown_content.splitlines() if line.strip()])
        return markdown_content
    else:
        print("Failed to extract the blog content.")
        return None


# 2024-10-03 Load environment variables
print('### Get values From .env')
config = GetConfig()
dict_data = config.get_config_as_dict()
# Example URL
url = input("Enter your blog address : ")
# markdown_content = get_naver_blog_content_as_markdown(url)
post_article = get_naver_blog_content_as_markdown(url)
post_article = post_article.replace(">", "###")
post_article = re.sub(r"^\[!\[\].*?\]", "#### 이미지", post_article, flags=re.MULTILINE)
# if markdown_content:
# print("Markdown Content:")
# print(markdown_content)
# Save as a Markdown file
# with open("blog_content.md", "w", encoding="utf-8") as file:
# file.write(markdown_content)
# print("Blog content saved as blog_content.md")
print('### Convert to HTML - markdown to html')
# 2024-10-03 Convert Markdown to HTML
html = markdown.markdown(post_article)
# 2024-10-03 Register the WordPress post as a draft
print('### Create post')
wp = WordPress(dict_data)
rs = wp.create_post(2,html)
if __name__ == "__main__":
    # print(post_article)
    print("추가 확인을 위한 출력")
    if rs.ok:
        print(f"### 성공 code:{rs.status_code}")
    else:
        print(f"### 실패 code:{rs.status_code} reason:{rs.reason} msg:{rs.text}")
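
The `re.sub` call in this file replaces a line-leading linked image with a heading placeholder. The sketch below applies the same pattern to a made-up sample line so its effect can be inspected, next to a stricter variant that also consumes the link target. The sample text, `replace_images`, and `FULL_LINKED_IMAGE` are assumptions for illustration; the real markdownify output for a given post may differ.

```python
import re

# Pattern used in main_naver_blog_html.py: stops at the first ']' after '[![]'.
ORIGINAL_PATTERN = r"^\[!\[\].*?\]"
# Hypothetical stricter variant: matches the whole '[![](image)](link)' construct.
FULL_LINKED_IMAGE = r"^\[!\[\]\([^)]*\)\]\([^)]*\)"


def replace_images(text: str, pattern: str, placeholder: str = "#### 이미지") -> str:
    """Replace every line-leading match of `pattern` with `placeholder`."""
    return re.sub(pattern, placeholder, text, flags=re.MULTILINE)


# Made-up sample in the shape markdownify is assumed to emit for a linked image.
sample = "[![](https://img.example/a.png)](https://blog.example/post)\nbody text"

with_original = replace_images(sample, ORIGINAL_PATTERN)
with_full = replace_images(sample, FULL_LINKED_IMAGE)
```

On this sample the original pattern leaves the trailing `(https://blog.example/post)` attached to the placeholder, while the stricter variant removes the whole construct. Which behavior is wanted depends on whether the link target should survive in the draft.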


@ -3,10 +3,13 @@ from dotenv import load_dotenv
# Priority: .env.prd > .env.dev > .env
if os.path.exists('.env.prd'):
    print("Read ::: .env.prd")
    load_dotenv('.env.prd')
elif os.path.exists('.env.dev'):
    print("Read ::: .env.dev")
    load_dotenv('.env.dev')
else:
    print("Read ::: .env")
    load_dotenv('.env')  # default .env file
class GetConfig:
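
The priority chain above can also be written as a lookup over an ordered list, which keeps the file names in one place. This is a sketch only: `pick_env_file` is a hypothetical helper, and like the original it falls back to `.env` even when that file does not exist.

```python
import os


def pick_env_file(directory=".", candidates=(".env.prd", ".env.dev", ".env")):
    """Return the path of the first existing candidate, in priority order.

    Hypothetical helper: when none of the files exist, the last candidate
    (the plain .env) is returned, mirroring the else-branch of the original.
    """
    for name in candidates:
        path = os.path.join(directory, name)
        if os.path.exists(path):
            return path
    return os.path.join(directory, candidates[-1])
```

The returned path could then be passed to `load_dotenv`, for example `load_dotenv(pick_env_file())`.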


@ -4,8 +4,15 @@ from bs4 import BeautifulSoup
from datetime import datetime
def getContents(url):
    # ✅ Add a User-Agent header (to avoid 403 responses)
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) ' \
        'AppleWebKit/537.36 (KHTML, like Gecko) ' \
        'Chrome/113.0.0.0 Safari/537.36'
    }
    # Fetch the page with an HTTP GET request
    response = requests.get(url)
    response = requests.get(url, headers=headers)
    # Check the response status
    if response.status_code == 200:
@ -14,6 +21,8 @@ def getContents(url):
        # Strip the HTML tags and get all text on the page (full content)
        page_content = soup.get_text()
        print("### url DEBUG ###")
        print(page_content)
        # Remove blank lines and keep only the text (filtering on newline characters)
        lines = [line.strip() for line in page_content.splitlines() if line.strip()]
@ -29,7 +38,6 @@ class WordPress():
    def __init__(self, dict):
        self.wp_url = dict['wp_url']
        self.wp_user = dict['wp_user']
        self.wp_user = dict['wp_user']
        self.wp_api_key = dict['wp_api_key']
    def create_post(self, category_id, content, media_id = None, status = "draft", title="파이썬 자동 포스팅"):
@ -55,7 +63,7 @@ class WordPress():
# print(f"실패 code:{result.status_code} reason:{result.reason} msg:{result.text}")
if __name__ == "__main__":
    # url = 'example_url'
    # url = 'https://www.hani.co.kr/arti/science/science_general/1161001.html'
    # tmp = getContents(url)
    # print(tmp)
    pass
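
The body of `create_post` is not shown in this diff. As a rough sketch of what a draft registration could look like against the WordPress REST API, the code below builds the JSON payload separately from the HTTP call. It assumes that `wp_api_key` is a WordPress application password used with HTTP Basic auth and that the site exposes the standard `/wp-json/wp/v2/posts` route; `build_post_payload` and `send_draft` are hypothetical names, not the repository's implementation.

```python
def build_post_payload(category_id, content, media_id=None, status="draft",
                       title="파이썬 자동 포스팅"):
    """Build the JSON body for a WordPress REST API post (hypothetical helper)."""
    payload = {
        "title": title,
        "content": content,
        "status": status,             # "draft" keeps the post unpublished
        "categories": [category_id],  # the REST API expects a list of term IDs
    }
    if media_id is not None:
        payload["featured_media"] = media_id
    return payload


def send_draft(wp_url, wp_user, wp_api_key, payload, timeout=10):
    """POST the payload to the posts endpoint and return the response.

    Assumption: wp_api_key is an application password accepted via Basic auth.
    """
    import requests  # imported lazily so the payload builder has no dependencies

    endpoint = f"{wp_url.rstrip('/')}/wp-json/wp/v2/posts"
    return requests.post(endpoint, auth=(wp_user, wp_api_key),
                         json=payload, timeout=timeout)
```

Keeping the payload builder free of network code makes the draft fields easy to check in isolation, while `send_draft` stays a thin wrapper whose response can be inspected with `rs.ok` as in main.py.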

3
pyvenv.cfg Normal file

@ -0,0 +1,3 @@
home = /usr/bin
include-system-site-packages = false
version = 3.10.12


@ -3,6 +3,7 @@ anyio==4.6.0
beautifulsoup4==4.12.3
certifi==2024.8.30
charset-normalizer==3.3.2
chromedriver-py==132.0.6834.83
colorama==0.4.6
distro==1.9.0
exceptiongroup==1.2.2
@ -12,12 +13,14 @@ httpx==0.27.2
idna==3.10
jiter==0.5.0
Markdown==3.7
markdownify==0.14.1
mysql-connector-python==9.0.0
openai==1.51.0
pydantic==2.9.2
pydantic_core==2.23.4
python-dotenv==1.0.1
requests==2.32.3
six==1.17.0
sniffio==1.3.1
soupsieve==2.6
tqdm==4.66.5


@ -6,4 +6,4 @@ OPENAI_API_KEY=demo
WP_URL='https://www.example.com'
WP_USER='demo'
WP_API_KEY='demo'
WP_POST_STYLE="문장"
WP_POST_STYLE="문장" # Used by OpenAI.

2
tempCodeRunnerFile.py Normal file

@ -0,0 +1,2 @@
if rs.ok:


@ -1 +1 @@
0.1.0
0.1.2